Abstract


Topological Data Analysis is a growing area of data science, which aims at computing and characterizing the geometry 
and topology of data sets, in order to produce useful descriptors for subsequent statistical and machine learning tasks. 
Its main computational tool is persistent homology, which amounts to tracking the topological changes in growing families
of subsets of the data set itself, called filtrations, and encoding them in an algebraic object, called a persistence module. Even
though algorithms and theoretical properties of modules are now well-known in the single-parameter case, that is, when 
there is only one filtration to study, much less is known in the multi-parameter case, where several filtrations are given 
at once. Though more complicated, the resulting persistence modules are usually richer and encode more information, 
making them better descriptors for data science. 

In this article, we present the first approximation scheme, which is based on fibered barcodes and exact matchings, 
two constructions that stem from the theory of single-parameter persistence, for computing and decomposing general 
multi-parameter persistence modules. Our algorithm has controlled complexity and running time, and works in arbitrary 
dimension, i.e., with an arbitrary number of filtrations. Moreover, when restricting to specific classes of multi-parameter 
persistence modules, namely the ones that can be decomposed into intervals, we establish theoretical results about the 
approximation error between our estimate and the true module in terms of interleaving distance. Finally, we present 
empirical evidence validating output quality and speed-up on several data sets. 


1. Introduction 


Topological Data Analysis (TDA) is an area of data science that has been developing quite fast and that 
has gathered the interest of many practitioners in the last few years, due to its success in various applications. At 
its core is the use of computational tools from algebraic topology to capture multiscale shape information from data, 
that require only mild assumptions about the data (e.g., a metric or similarity measure between points) in order to be 
applied. Moreover, a centerpiece of the formal foundations of TDA is a set of mathematical guarantees that ensure the resulting
descriptors are reasonably efficient to compute and robust to perturbations. As such, TDA has been applied successfully
in a wide range of scientific fields, including bioinformatics, computer graphics, and machine learning, among others.


Persistent homology. The main computational tool of TDA is persistent homology (PH). Whereas homology is a 
descriptor of a topological space X, the core idea of PH is to study how the homology groups change when computed 
on a specific family of subspaces of $X$ called a filtration of $X$. A filtration is a family $F$ of subspaces of $X$ indexed over
a partially ordered set $I$, $F = \{X_i \subseteq X\}_{i \in I}$, that is nested w.r.t. inclusion, i.e., it satisfies $X_i \subseteq X_j$ for any $i \le j$. Then,
the functoriality of homology and these inclusions induce morphisms between the corresponding homology groups,
$H_*(X_i) \to H_*(X_j)$ for each pair $i \le j$, which allow one to detect the differences in homology when going from index $i$ to
index $j$. One of the most common ways to produce such filtrations is to study the sublevel sets of a continuous filter
function $f : X \to \mathbb{R}^n$, defined with $F = \{\{x \in X : f(x) \le u\}\}_{u \in \mathbb{R}^n}$, where the partial order on the poset $\mathbb{R}^n$ (denoted by
$\le$) is defined, for any $a, b \in \mathbb{R}^n$, as $a \le b$ if and only if $a_i \le b_i$ for any $1 \le i \le n$.
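
To make the product order and the sublevel-set filtration concrete, here is a minimal Python sketch. The point cloud and the two-parameter filter `f` are hypothetical toy choices made for illustration only; they do not come from the article.

```python
from typing import Callable, Sequence, Set, Tuple


def leq(a: Sequence[float], b: Sequence[float]) -> bool:
    """Product order on R^n: a <= b iff a_i <= b_i for every coordinate i."""
    return len(a) == len(b) and all(ai <= bi for ai, bi in zip(a, b))


def sublevel_set(points: Sequence[float],
                 f: Callable[[float], Tuple[float, ...]],
                 u: Sequence[float]) -> Set[int]:
    """Indices of the points x such that f(x) <= u in the product order."""
    return {i for i, x in enumerate(points) if leq(f(x), u)}


# Toy data (assumption): four points on the real line and a filter with values in R^2.
points = [0.0, 1.0, 2.0, 3.0]
f = lambda x: (abs(x - 1.5), x)

small = sublevel_set(points, f, (0.5, 2.0))
large = sublevel_set(points, f, (1.5, 3.0))
```

Since $(0.5, 2.0) \le (1.5, 3.0)$, the first sublevel set is contained in the second, which is the nesting property required of a filtration. Note also that the product order is only partial: `(0, 1)` and `(1, 0)` are not comparable.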


Single-parameter PH. When $I$ is totally ordered, e.g., when $I \subseteq \mathbb{R}$, then applying the homology functor $H_*(-; k)$
for a field $k$ to a (single-parameter) filtration results in a sequence of vector spaces connected by linear transformations.
This sequence is called a single-parameter persistence module and has been studied extensively in the TDA
literature [Car09][CdGO16][EH10][Oud15]. Notably, one can show that such persistence modules can always be decomposed
into a direct sum of simple summands, which intuitively represent the appearances (birth) and disappearances
(death) of topological structures detected by homology as the index increases. Moreover, single-parameter persistence
modules can be efficiently represented in a compact descriptor called the persistence barcode, and several vectorization
methods, as well as kernels and machine learning classifiers, have been proposed for such barcodes in the literature
[Bub15][AEK*17][RHBK15][CCO17][CCI*20]. As a consequence, most applications of TDA use single-parameter
persistence modules, and often use the sublevel sets of, e.g., the data set scale, as the corresponding single-parameter 
filtration. 


Multi-parameter PH. However, many data sets come with not just one, but multiple, possibly intertwined, salient 
filtrations. For example, image data typically has both a spatial filtration and an intensity filtration. Arbitrary point
cloud data can be filtered both by feature scale and density. Unfortunately, in general, the resulting multi-parameter
persistence modules obtained by applying the homology functor to a filtration indexed over $\mathbb{R}^n$ are much
less tractable; contrary to the single-parameter case, there is no general decomposition theorem that can break down any 
module into a direct sum of simple summands such as, e.g., interval modules. 


Contributions. In this article, we build on the heuristic construction of [CB20] based on the fibered barcode [LW15] 
and propose the first approximate decomposition of general multi-parameter persistence modules in arbitrary dimension. 


1. We introduce a new candidate approximate decomposition, parameterized by an approximation parameter $\delta > 0$,
that can be computed for any multi-parameter persistence module with a running time controlled in terms of $N$, the number of simplices, and $n$, the number of filtrations (Algorithm 1 in Section 3.3).


2. When computed over interval decomposable modules, we prove that the interleaving distance between our construction
$\tilde{M}$ and the module $M$ it approximates is upper bounded under mild assumptions (Proposition 5.5):


$d_I(\tilde{M}, M) \le d_B(\tilde{M}, M) \le \delta.$


Even though our theoretical result only applies to interval decomposable modules, our candidate approximation
$\tilde{M}$ always has the same fibered barcode and rank invariant as the module $M$ it approximates. Moreover,
we hypothesize that our approximate decomposition is also stable for modules that are close to being interval 
decomposable, while being more powerful than the rank invariant. 


3. We perform numerical experiments that showcase the performance of this approximation and exhibit the trade-off 
between computation time and approximation error (Section 7).


Related work. There are several works in the literature that focused on the problem of computing or approximating 
multi-parameter persistence modules. 

When restricted to filtrations indexed over $\mathbb{R}^2$, decomposition theorems have been provided under strong assumptions
about the filter functions [ABE*21][BLO20][BLO22][BL18][CO19][Les15], as well as efficient algorithms for comparing
these decompositions [KLO19][KN20][Vip20a]. Minimal presentations of bimodules of simplicial complexes can also
be computed with Rivet in $O(N^3\kappa + (N + \log\kappa)\kappa^2)$ operations, where $N$ is the number of simplices, and
$\kappa = \kappa_x\kappa_y$ is the product of the numbers of unique x and y coordinates in the support of the module. Approximation schemes and
methods to produce estimate modules have also been proposed with polynomial complexity, based on, e.g.,
Möbius inversions [AENY19] or rectangle summands [DX21]. While these running times are comparable to ours, we
substantially generalize these approaches since our approximation can be computed for any number of filtrations. 

For general multi-parameter persistence modules in dimension $n$, i.e., computed from filtrations indexed over $\mathbb{R}^n$,
alternative descriptors of the multi-parameter persistence modules (that are complete under specific assumptions) have
been presented [BOO21][CB20][CFK*19][Vip20b], and a decomposition algorithm for modules computed on simplicial
complexes and indexed over a grid has been proposed [DX22]. This algorithm has complexity $O(N^{n(2\omega+1)})$, where $N$
is the number of simplices (which can typically be more than cubic in the number of data points, depending on the
homology dimension) and $\omega < 2.373$, and is thus very limited by the size of the input. Hence, computing approximate
decompositions for multi-parameter persistence modules indexed over $\mathbb{R}^n$ for arbitrary $n \in \mathbb{N}^*$ with controlled complexity,
running time, and approximation error is still an open and important question, which we tackle in this article. 


Outline. Section 2 provides a concise review of multi-parameter persistence modules. In Section 3, we present our
approximation scheme for general multi-parameter persistence modules, and in Sections 4 and 5 we study the theoretical
properties of our construction for interval decomposable modules. We also discuss in depth the exact matching parameter
of our construction in Section 6. Finally, we illustrate the performance of our candidate in Section 7.


2. Background 


In this section, we recall the basics of multi-parameter persistence modules. This section only contains the necessary 
background and notations, and can be skipped if the reader is already familiar with persistence theory. A more complete 
treatment of persistence modules can be found in [Oud15][CdGO16][DW22]. A more precise description of multi-parameter
persistence modules computed from filtered simplicial complexes can also be found in Section 6.3.


2.1. Multi-parameter persistence modules 


In their most general form, i.e., regardless of whether they are computed from simplicial complexes, topological spaces,
etc., multi-parameter persistence modules are nothing but $k$-vector spaces indexed by $\mathbb{R}^n$ and connected by linear
transformations (where $k$ denotes a field).


Definition 2.1 (Multi-parameter persistence module). An n-multi-parameter persistence module (or n-multipersistence
module for short) is a covariant functor $M$ from $\mathbb{R}^n$ to the category of $k$-vector spaces, $M : x \in \mathbb{R}^n \mapsto M_x$. The linear
transformations $\{\varphi_x^y : M_x \to M_y\}_{x, y \in \mathbb{R}^n,\, x \le y}$ are called the transition maps of $M$. In particular, functoriality imposes the
following property on the transition maps: $\varphi_x^z = \varphi_y^z \circ \varphi_x^y$ for any $x \le y \le z$.

A morphism between two n-multipersistence modules $M$, $N$ with transition maps $\varphi_\cdot^\cdot$ and $\psi_\cdot^\cdot$ respectively, is a collection
of linear maps $f = \{f_x : M_x \to N_x\}_{x \in \mathbb{R}^n}$, called an n-multipersistence morphism, that commutes with transition maps,
i.e., one has $f_y \circ \varphi_x^y = \psi_x^y \circ f_x$ for all $x \le y$.


Multipersistence modules can be compared with the interleaving distance [Les15], which is one of the most commonly
used distances in TDA, and which is based on the shift functor. 


Definition 2.2 (Shift functor). Let $v \in \mathbb{R}^n$. The $v$-shift functor is the endofunctor $(-)(v)$ that maps an n-multipersistence
module $M$ (resp. an n-multipersistence morphism $f$) to $M(v)$ (resp. $f(v)$) defined, for any $x \in \mathbb{R}^n$, as $M(v)_x := M_{x+v}$
(resp. $f(v)_x := f_{x+v}$).


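Definition 2.3 (Interleaving distance). Given $\varepsilon \ge 0$, two n-multipersistence modules $M$ and $N$ are $\varepsilon$-interleaved if
there exist two morphisms $f : M \to N(\boldsymbol{\varepsilon})$ and $g : N \to M(\boldsymbol{\varepsilon})$ such that $g(\boldsymbol{\varepsilon}) \circ f = \varphi_\cdot^{\cdot + 2\boldsymbol{\varepsilon}}$ and $f(\boldsymbol{\varepsilon}) \circ g = \psi_\cdot^{\cdot + 2\boldsymbol{\varepsilon}}$, where
$\boldsymbol{\varepsilon} = (\varepsilon, \dots, \varepsilon) \in \mathbb{R}^n$, and $\varphi$ and $\psi$ are the transition maps of $M$ and $N$ respectively.

The interleaving (pseudo)distance between two multipersistence modules $M$ and $N$ is then defined as


$d_I(M, N) = \inf\{\varepsilon \ge 0 : M \text{ and } N \text{ are } \varepsilon\text{-interleaved}\}.$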
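
As a small worked example of this definition, one can compute by hand the interleaving distance between an indicator module and the null module. The half-open rectangle used as support below is an assumption made for this illustration and is not taken from the text.

```latex
% Assumption: I is the indicator module supported on the half-open rectangle
% S = [a_1, b_1) x ... x [a_n, b_n), with a_i < b_i for all i.
% With N = 0 the only candidate morphisms are f = 0 and g = 0, so I and 0 are
% eps-interleaved iff every transition map of I from x to x + 2*eps*(1,...,1)
% vanishes, i.e. iff no x in S has x + 2*eps*(1,...,1) in S.
\[
  d_I(I, 0)
  = \inf\bigl\{\varepsilon \ge 0 : \nexists\, x \in S \text{ with } x + 2\boldsymbol{\varepsilon} \in S\bigr\}
  = \tfrac{1}{2}\,\min_{1 \le i \le n}\,(b_i - a_i).
\]
```

In words, under this assumption a rectangle that is thin in at least one direction is close to the null module, whatever its extent in the other directions.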


Another usual distance is the bottleneck distance [BL18, Section 2.3]. Intuitively, it relies on decompositions of the
modules into direct sums of summands, and is defined as the interleaving distance between these summands. As such, it 
first requires the definition of a matching between summands. 


Definition 2.4 (Matching). Given two multisets $A$ and $B$, $\mu : A \nrightarrow B$ is called a matching if there exist $A' \subseteq A$ and $B' \subseteq B$
such that $\mu : A' \to B'$ is a bijection. In the following, we let $\mathrm{im}(\mu) = B'$ and $\mathrm{coim}(\mu) = A'$.


Moreover, in order to define meaningful decompositions, summands are required to be indecomposable modules. 


Definition 2.5 (Indecomposable module). A multipersistence module $M$ is indecomposable if
$M \cong A \oplus B \Rightarrow M \cong A \text{ or } M \cong B.$


Since decompositions of multipersistence modules are unique [Oud15], the following distance is well-defined. 


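Definition 2.6 (Bottleneck distance). Let $M \cong \bigoplus_{i \in \mathcal{I}} M_i$ and $N \cong \bigoplus_{j \in \mathcal{J}} N_j$ be two multipersistence modules decomposed
into indecomposable summands. Given $\varepsilon \ge 0$, the modules $M$ and $N$ are $\varepsilon$-matched if there exists a matching
$\sigma : \mathcal{I} \nrightarrow \mathcal{J}$ such that
1. for all $i \in \mathcal{I} \setminus \mathrm{coim}(\sigma)$, $M_i$ is $\varepsilon$-interleaved with the null module 0,
2. for all $j \in \mathcal{J} \setminus \mathrm{im}(\sigma)$, $N_j$ is $\varepsilon$-interleaved with the null module 0,
3. for all $i \in \mathrm{coim}(\sigma)$, $M_i$ and $N_{\sigma(i)}$ are $\varepsilon$-interleaved.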


The bottleneck distance, denoted by $d_B$, between two multipersistence modules $M$ and $N$ is then defined as
$d_B(M, N) = \inf\{\varepsilon \ge 0 : M \text{ and } N \text{ are } \varepsilon\text{-matched}\}.$


Since a matching between the decompositions of two multipersistence modules induces an interleaving between the
modules themselves, it follows that $d_I \le d_B$. Note that the bottleneck distance can actually be arbitrarily larger than the
interleaving distance, as showcased in [BL18, Section 9].


2.2. Interval modules 


The study of multipersistence modules is easier when restricted to a specific class called the interval modules. For instance, 
all of our theoretical results presented in Sections 4 and 5 are stated for modules that can be decomposed into intervals.
Hence, in this section, we define such interval modules. Intuitively, they are modules that are trivial, except on a subset
of $\mathbb{R}^n$ called an interval.


Definition 2.7 (Interval). A subset $I$ of $\mathbb{R}^n$ is called an interval if it satisfies:
• (convexity) if $p, q \in I$ and $p \le r \le q$ then $r \in I$, and
• (connectivity) if $p, q \in I$, then there exists a finite sequence $r_1, r_2, \dots, r_m \in I$, for some $m \in \mathbb{N}$, such that $p \sim r_1 \sim r_2 \sim \dots \sim r_m \sim q$, where $\sim$ can be either $\le$ or $\ge$.
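
The two conditions of Definition 2.7 can be checked mechanically on a discrete analogue. The sketch below assumes that the set is a finite subset of the integer grid $\mathbb{Z}^n$ and that intermediate points $r$ range over grid points only; this restriction is an assumption made for illustration and is not part of the definition.

```python
from itertools import product


def leq(a, b):
    """Product order: a <= b iff a_i <= b_i for all i."""
    return all(x <= y for x, y in zip(a, b))


def is_interval(points):
    """Check convexity and connectivity for a finite subset of Z^n (grid analogue)."""
    S = set(points)
    if not S:
        return True  # both conditions hold vacuously
    # Convexity: every grid point between two comparable points of S lies in S.
    for p in S:
        for q in S:
            if leq(p, q):
                box = product(*(range(pi, qi + 1) for pi, qi in zip(p, q)))
                if any(r not in S for r in box):
                    return False
    # Connectivity: S is connected in the graph linking comparable points.
    start = next(iter(S))
    seen, stack = {start}, [start]
    while stack:
        p = stack.pop()
        for q in S:
            if q not in seen and (leq(p, q) or leq(q, p)):
                seen.add(q)
                stack.append(q)
    return seen == S


l_shape = {(0, 0), (1, 0), (2, 0), (0, 1), (0, 2)}
```

Here `l_shape` passes both tests, `{(0, 0), (2, 2)}` fails convexity because `(1, 1)` is missing, and `{(0, 1), (1, 0)}` fails connectivity because its two points are not comparable.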


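Definition 2.8 (Indicator module, Interval module). An n-multipersistence module $M$ is an n-indicator module if there
exists a set $S \subseteq \mathbb{R}^n$, called the support of $M$ and denoted by $\mathrm{supp}(M)$, such that:


$\forall x \in \mathbb{R}^n,\ M_x = \begin{cases} k & \text{if } x \in S \\ 0 & \text{otherwise} \end{cases}$ and $\forall x, y \in \mathbb{R}^n \text{ with } x \le y,\ \varphi_x^y = \begin{cases} \mathrm{id}_k & \text{if } x, y \in S \\ 0 & \text{otherwise} \end{cases}$


where $\varphi_\cdot^\cdot$ are the transition maps of $M$. If $S$ is an interval, $M$ is called an n-interval module.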


An important consequence of functoriality is that whenever two points are in the support of an indicator module, then
the whole rectangle induced by those points must be in the support as well, as stated by the following lemma.


Lemma 2.9. Let $I$ be an n-indicator module. Then one has $a, b \in \mathrm{supp}(I) \Leftrightarrow R_{a,b} \subseteq \mathrm{supp}(I)$, where, given two points
$a, b \in \mathbb{R}^n$, the corresponding rectangle $R_{a,b}$ is defined as $R_{a,b} := \{x \in \mathbb{R}^n : a \le x \le b\}$, and $R_{a,b} = \emptyset$ if $b < a$ or if $a$ and $b$ are
not comparable in $\mathbb{R}^n$.


Proof. Since $\Leftarrow$ is trivial, we only prove $\Rightarrow$. If $x \not\le y$, then $R_{x,y}$ is empty and the result holds. Otherwise, if $R_{x,y}$ is
not empty, let $z \in R_{x,y}$, i.e., $x \le z \le y$, and let $\varphi_\cdot^\cdot$ be the transition maps of $I$. Since $\varphi_x^y = \varphi_z^y \circ \varphi_x^z = \mathrm{id}$, one has
$1 \ge \dim I_z \ge \dim I_x = \dim I_y = 1$ (see Definition 2.8), which means that $z \in \mathrm{supp}(I)$. □


Note that one cannot use any set $S$ for defining an indicator module, since transition maps of modules must satisfy
some properties coming from functoriality (see Definition 2.1). However, one can define a module induced from a set
using the following definition.


Definition 2.10 (Induced module). Given a subset $S \subseteq \mathbb{R}^n$, the indicator module $\mathrm{Ind}(S)$ induced by $S$ is defined as the
indicator module with support $\{x \in \mathbb{R}^n : \exists a, b \in S \text{ such that } a \le x \le b\}$.
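
The support in Definition 2.10 can be computed directly when everything is restricted to a finite grid. The following sketch makes that restriction as an assumption (the definition itself is stated over $\mathbb{R}^n$); the function name `induced_support` is hypothetical.

```python
from itertools import product


def leq(a, b):
    """Product order: a <= b iff a_i <= b_i for all i."""
    return all(x <= y for x, y in zip(a, b))


def induced_support(S, grid):
    """Grid points x such that a <= x <= b for some a, b in S."""
    return {
        x for x in grid
        if any(leq(a, x) for a in S) and any(leq(x, b) for b in S)
    }


grid = list(product(range(3), repeat=2))
filled = induced_support({(0, 0), (2, 2)}, grid)      # two comparable points
unchanged = induced_support({(0, 1), (1, 0)}, grid)   # two incomparable points
```

Two comparable points induce the full rectangle between them, in agreement with Lemma 2.9, whereas two incomparable points induce nothing beyond themselves.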


Finally, interval decomposable modules are then defined as those multipersistence modules that are made of intervals. 


Definition 2.11 (Interval decomposable module). An interval decomposable module M is a multipersistence module that 
is isomorphic to a direct sum of interval modules. 


Note that for rectangle decomposable modules, i.e., interval decomposable modules whose supports are rectangles
in $\mathbb{R}^n$, it is possible to control the bottleneck distance more precisely with $d_B \le (2n - 1)\, d_I$ [Bje20, Theorem 4.3]. In the
following, we present a few properties of interval modules that are often very useful for their theoretical analysis. 


Definition 2.12 (Discretely presented interval module). An n-interval module $I$ is discretely presented if its support is a
locally finite union of rectangles in $\mathbb{R}^n$ whose boundary is an $(n-1)$-submanifold of $\mathbb{R}^n$. More precisely, there exist
two families of points, the birth and death critical points of $I$, denoted by $C_B(I)$ and $C_D(I)$ respectively, such that:


$I = \mathrm{Ind}\Bigl(\bigcup_{c \in C_B(I)} \bigcup_{c' \in C_D(I)} R_{c,c'}\Bigr).$ (1)


A useful property of interval modules is that they can be described with their upper- and lower-boundaries, also called
upsets and downsets [Mil20, Section 1.4].


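Definition 2.13 (Upper- and lower-boundaries). Given an interval $I$, its upper-boundary $U[I]$ and lower-boundary $L[I]$
are defined as:


$L[I] = \{x \in I : \forall y \in \mathbb{R}^n,\ y < x \Rightarrow y \notin I\}, \quad U[I] = \{x \in I : \forall y \in \mathbb{R}^n,\ y > x \Rightarrow y \notin I\}.$
Moreover, the boundary of $\mathrm{supp}(I)$ can be decomposed with $\partial\,\mathrm{supp}(I) = L[I] \cup U[I]$. See Figure 1 for an illustration.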


When interval modules are discretely presented, their lower- and upper-boundaries are made of flat parts, which
are the faces of the corresponding rectangles forming the module. Hence, we call facets the subsets of the lower- and
upper-boundaries that are included in some hyperplanes of $\mathbb{R}^n$.


Definition 2.14 (Facet). A lower (resp. upper) facet of an interval $I$ is an $(n-1)$-submanifold of $\partial\,\mathrm{supp}(I)$ written as
$\{x \in \mathbb{R}^n : x_i = c\} \cap L[I]$ (resp. $\{x \in \mathbb{R}^n : x_i = c\} \cap U[I]$) for some $c \in \mathbb{R}$ and some dimension $i \in [\![1, n]\!]$ that is called
the facet codirection. In particular, the upper- and lower-boundaries of a discretely presented interval module are (locally)
finite unions of facets.


2.3. Interval morphisms 


For indicator modules, there is only one possible transition map, the identity (up to an invertible scalar). This induces a 
canonical way to define morphisms between indicator modules (and thus between interval modules as well). 


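Definition 2.15 (Indicator module morphisms). Let $I$ and $J$ be two indicator modules. The collections of linear maps
$\varphi_{I \to J}$ and $\varphi_{J \to I}$ between $I$ and $J$ are called indicator module morphisms, and are defined with


$I \xrightarrow{\ \varphi_{I \to J}\ } J(\boldsymbol{\varepsilon}) \xrightarrow{\ \varphi_{J \to I}(\boldsymbol{\varepsilon})\ } I(2\boldsymbol{\varepsilon})$ and $J \xrightarrow{\ \varphi_{J \to I}\ } I(\boldsymbol{\varepsilon}) \xrightarrow{\ \varphi_{I \to J}(\boldsymbol{\varepsilon})\ } J(2\boldsymbol{\varepsilon}),$ (2)


where $\boldsymbol{\varepsilon} = (\varepsilon, \dots, \varepsilon) \in \mathbb{R}^n$ and where $\varphi_{I \to J}$ is defined, for $x \in \mathbb{R}^n$, by
$(\varphi_{I \to J})_x : I_x = k \text{ or } \{0\} \to J_{x + \boldsymbol{\varepsilon}} = k \text{ or } \{0\}, \quad y \mapsto \begin{cases} y \in k & \text{if } x + \boldsymbol{\varepsilon} \in \mathrm{supp}(J) \\ 0 & \text{otherwise} \end{cases}$


and vice-versa for $\varphi_{J \to I}$. Note that $\varphi_{I \to J}$ and $\varphi_{J \to I}$ define an $\varepsilon$-interleaving between $I$ and $J$ if they commute.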


2.4. Fibered barcode 


The fibered barcode is a centerpiece of our approximation scheme, and is defined, given an n-multipersistence
module $M$, as a map that takes as input a line (or segment) $l$ in $\mathbb{R}^n$, and outputs the persistence barcode associated to the
single-parameter persistence module obtained by restricting $M$ along $l$. Hence, in the following, we formalize and define
intersections between multipersistence modules and lines in $\mathbb{R}^n$.


Definition 2.16. Given a line $l \subset \mathbb{R}^n$, we let $\iota_l$ denote the induced functor $\iota_l : L \to \mathbb{R}^n$, where $L$ is the full subcategory of
$\mathbb{R}^n$ associated to $l$. The module $M|_l := M \circ \iota_l$ is called the restriction of $M$ to $l$.


Remark 2.17. When $M = \bigoplus_{i \in \mathcal{I}} M_i$ is decomposable into indicator modules, the support of the restriction of $M$ to $l$
is a set of segments called bars, aggregated in a barcode: $B(M|_l) = \mathrm{supp}(M|_l) = (\mathrm{supp}(M_i|_l))_{i \in \mathcal{I}}$. Note that this
barcode corresponds exactly to the barcode defined in the theory of single-parameter persistence, computed on the
single-parameter filtration induced by $l$.


Definition 2.18 (Fibered Barcode). Let $M = \bigoplus_{i \in \mathcal{I}} M_i$ be a pointwise finite-dimensional n-multipersistence module.
The complete fibered barcode of $M$ is defined as the family of barcodes $\mathrm{CFB}(M) = \{B(M|_l) : l \in \mathcal{L}\}$, where $\mathcal{L}$ denotes
the set of diagonal lines in $\mathbb{R}^n$, i.e., those lines with direction vector $\mathbf{1} = (1, \dots, 1) \in \mathbb{R}^n$. Given a (possibly discrete) family
of diagonal lines $L \subseteq \mathcal{L}$, we let the $L$-fibered barcode (or fibered barcode for short when $L$ is clear) be the restriction of the
complete fibered barcode to $L$, i.e., $\mathrm{FB}(M)_L = \{B(M|_l) : l \in L\}$.
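
To illustrate the definition above, the following sketch computes an $L$-fibered barcode in the simplest setting, under the assumption that the module is a direct sum of indicator modules supported on closed rectangles $[a, b]$. Each diagonal line is represented by one basepoint $x$ and parameterized as $t \mapsto x + t \cdot (1, \dots, 1)$; bars are returned as parameter intervals and zero-length bars are dropped. All names are hypothetical.

```python
def bar_along_diagonal(a, b, x):
    """Interval (t0, t1) of parameters t with x + t*(1,...,1) in the rectangle [a, b], or None."""
    t0 = max(ai - xi for ai, xi in zip(a, x))  # last coordinate to enter the rectangle
    t1 = min(bi - xi for bi, xi in zip(b, x))  # first coordinate to leave the rectangle
    return (t0, t1) if t0 < t1 else None


def fibered_barcode(rectangles, basepoints):
    """One barcode per diagonal line, each line being given by a basepoint."""
    barcodes = []
    for x in basepoints:
        bars = [bar_along_diagonal(a, b, x) for a, b in rectangles]
        barcodes.append([bar for bar in bars if bar is not None])
    return barcodes


# Two rectangle summands in R^2 and three diagonal lines (toy data).
rectangles = [((0.0, 0.0), (2.0, 1.0)), ((1.0, 1.0), (3.0, 3.0))]
basepoints = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
barcodes = fibered_barcode(rectangles, basepoints)
```

The third line meets the second rectangle only at the single point $(1, 3)$ and misses the first one, so its barcode is empty.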


It is also useful to characterize intersections between modules and lines using the endpoints of lines. 


Definition 2.19 (Birthpoint, Deathpoint). Given a point $x \in \mathbb{R}^n$ and an indicator module $I$, we call $b_x^I := \{x + \boldsymbol{\delta} : \delta \in \mathbb{R}\} \cap L[I]$
(resp. $d_x^I := \{x + \boldsymbol{\delta} : \delta \in \mathbb{R}\} \cap U[I]$) the birthpoint (resp. deathpoint) associated to $x$ and $I$ (see Figure 1 for an
illustration), where $\boldsymbol{\delta} = (\delta, \dots, \delta) \in \mathbb{R}^n$. Since it follows from the definition of $L[I]$ and $U[I]$ that $b_x^I$ and $d_x^I$ are singletons,
we slightly abuse notations and use $b_x^I$ and $d_x^I$ to also refer to the unique element these sets contain. When these sets are
empty, or $b_x^I = d_x^I$, we say $b_x^I$ and $d_x^I$ are trivial.

Similarly, given a diagonal line $l \subset \mathbb{R}^n$ (i.e., a line with direction vector $(1, \dots, 1) \in \mathbb{R}^n$), we define the birthpoint (resp.
deathpoint) associated to $l$ and $I$ as $b_l^I := b_x^I$ (resp. $d_l^I := d_x^I$) for any $x \in l$.
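
Birthpoints and deathpoints are easy to compute when the support of the interval is given as a finite union of closed rectangles, which is the assumption made in this sketch. Because an interval is order-convex, a diagonal line meets its support in a single segment, so the endpoints are the smallest entry parameter and the largest exit parameter over all rectangles that the line hits. The function name `endpoints` is hypothetical.

```python
def endpoints(rectangles, x):
    """Birthpoint and deathpoint of the diagonal line through x, or None if it misses the support.

    Assumes the union of the closed rectangles [a, b] is the support of an interval.
    """
    hits = []
    for a, b in rectangles:
        t0 = max(ai - xi for ai, xi in zip(a, x))
        t1 = min(bi - xi for bi, xi in zip(b, x))
        if t0 <= t1:  # the line meets this rectangle
            hits.append((t0, t1))
    if not hits:
        return None
    t_birth = min(t0 for t0, _ in hits)
    t_death = max(t1 for _, t1 in hits)
    birth = tuple(xi + t_birth for xi in x)
    death = tuple(xi + t_death for xi in x)
    return birth, death


# L-shaped 2-interval: [0,2]x[0,1] union [0,1]x[0,2] (toy data).
l_shape = [((0.0, 0.0), (2.0, 1.0)), ((0.0, 0.0), (1.0, 2.0))]
diag = endpoints(l_shape, (0.0, 0.0))
right = endpoints(l_shape, (1.0, 0.0))
upper = endpoints(l_shape, (0.0, 1.5))
```

A returned pair with equal birthpoint and deathpoint corresponds to a trivial endpoint in the sense of the definition above.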


Figure 1: Lower- and upper-boundaries of a 2-interval (Definition 2.13), and birthpoints and deathpoints $b_x^I$ and $d_x^I$
(Definition 2.19) of a point $x \in \mathbb{R}^2$.


Remark 2.20. The rectangle $R_{a,b}$ induced by two birthpoints or deathpoints $a, b$ of the same indicator module is always
flat, i.e., at least one of its sides has length zero, as demonstrated by Figure 2.


Figure 2: Two bars $[b_1, d_1]$ and $[b_2, d_2]$ of some indicator module; if $R_{b_1, b_2}$ is not flat then, by Lemma 2.9, $b_2$ cannot be a
birthpoint since it would be possible to find a smaller birthpoint w.r.t. the partial order of $\mathbb{R}^n$ along the diagonal line
passing through $b_2$.


Remark 2.21. Using birthpoints and deathpoints, the $L$-fibered barcode of an interval decomposable multipersistence
module $M = \bigoplus_{i \in \mathcal{I}} M_i$ is written as:


$\mathrm{FB}(M)_L = \{B(M|_l) : l \in L\} = \bigl\{\bigl([b_l^{M_i}, d_l^{M_i}]\bigr)_{i \in \mathcal{I}} : l \in L\bigr\}.$ (3)


Note also that bars of the fibered barcode that are associated to lines that are close to each other must have similar 
length, as stated in the lemma below, which is very similar to Lemma 2]. 
Lemma 2.22. Let $I$ be an indicator module, let $l_1, l_2 \subset \mathbb{R}^n$ be two diagonal lines and let $u \in \mathbb{R}^n$ be a positive or negative
vector (i.e., the coordinates of $u$ are either all positive or all negative) such that $l_2 = l_1 + u$. Assume that the barcodes $B(I|_{l_1})$
and $B(I|_{l_2})$ are non empty, and let $[b_{l_1}^I, d_{l_1}^I]$ and $[b_{l_2}^I, d_{l_2}^I]$ be the corresponding bars in $\mathbb{R}^n$. Then, one has


$\|d_{l_1}^I - d_{l_2}^I\|_\infty \le \|u\|_\infty$ and $\|b_{l_1}^I - b_{l_2}^I\|_\infty \le \|u\|_\infty,$
where we used the conventions $(+\infty) - (+\infty) = (-\infty) - (-\infty) = 0$.

Proof. If one of the endpoints is infinite, the result holds trivially, so we now assume that the endpoints of the bars are all
finite. Without loss of generality, assume that $l_2 = l_1 + u$ where $u$ is positive. Now, since both $d_{l_2}^I$ and $d_{l_1}^I + u$ belong
to $l_2$, they are comparable, so one has either $d_{l_2}^I > d_{l_1}^I + u$ or $d_{l_2}^I \le d_{l_1}^I + u$. However, the first possibility would lead to
$d_{l_2}^I > d_{l_1}^I + u > d_{l_1}^I$, hence $d_{l_1}^I$ and $d_{l_2}^I$ would be (strictly) comparable in $\mathbb{R}^n$, which contradicts Remark 2.20. Thus, one
must have $d_{l_2}^I \le d_{l_1}^I + u$. Furthermore, and using the exact same arguments, $d_{l_2}^I - u + \|u\|_\infty \cdot \mathbf{1}$ is on $l_1$, and one must
have $d_{l_2}^I - u + \|u\|_\infty \cdot \mathbf{1} \ge d_{l_1}^I$. Finally, by combining the two previous inequalities, one has


$d_{l_1}^I - \|u\|_\infty \cdot \mathbf{1} \le d_{l_1}^I + u - \|u\|_\infty \cdot \mathbf{1} \le d_{l_2}^I \le d_{l_1}^I + u \le d_{l_1}^I + \|u\|_\infty \cdot \mathbf{1},$


which leads to the desired inequality for deathpoints. The proof extends straightforwardly to birthpoints. □
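
The bound of Lemma 2.22 can be sanity-checked numerically in the simplest case. The sketch below assumes that the indicator module is supported on a single closed rectangle $[a, b]$ and that both diagonal lines meet it; it is a check on toy data, not a proof, and all names are hypothetical.

```python
import random


def sup_norm(v):
    return max(abs(vi) for vi in v)


def birthpoint(a, x):
    """Point where the diagonal line through x enters the rectangle with lower corner a."""
    t = max(ai - xi for ai, xi in zip(a, x))
    return tuple(xi + t for xi in x)


def deathpoint(b, x):
    """Point where the diagonal line through x leaves the rectangle with upper corner b."""
    t = min(bi - xi for bi, xi in zip(b, x))
    return tuple(xi + t for xi in x)


def lemma_holds(a, b, x, u, tol=1e-9):
    """Compare the endpoints on the lines through x and x + u with the sup norm of u."""
    y = tuple(xi + ui for xi, ui in zip(x, u))
    gap_b = sup_norm([p - q for p, q in zip(birthpoint(a, x), birthpoint(a, y))])
    gap_d = sup_norm([p - q for p, q in zip(deathpoint(b, x), deathpoint(b, y))])
    return max(gap_b, gap_d) <= sup_norm(u) + tol


random.seed(0)
a, b = (0.0, 0.0, 0.0), (10.0, 10.0, 10.0)
trials = []
for _ in range(1000):
    x = tuple(random.uniform(4.0, 5.0) for _ in range(3))
    u = tuple(random.uniform(0.0, 1.0) for _ in range(3))  # positive shift
    trials.append(lemma_holds(a, b, x, u))
all_ok = all(trials)

# The sign assumption on u matters: a mixed-sign shift can violate the bound.
mixed_sign_ok = lemma_holds((0.0, 0.0), (10.0, 10.0), (5.0, 5.0), (1.0, -1.0))
```

In the mixed-sign case the two deathpoints are $(10, 10)$ and $(10, 8)$, at sup-norm distance 2 while $\|u\|_\infty = 1$, which shows why the lemma requires $u$ to be positive or negative.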


3. General multipersistence module approximation 


In this section, we present the first approximation scheme that works for any multipersistence module in arbitrary 
dimension, i.e., with arbitrary number of filtrations. In particular, one does not have to assume that the underlying 
module is decomposable in order to apply our method. Our candidate approximation works by sampling the underlying 
module with an ordered family of diagonal lines, computing the associated fibered barcode, and finally connecting the 
endpoints of bars in consecutive barcodes in a specific way. Note however that the theoretical properties that we show 
for our approximation in Sections 4 and 5 are valid only when the underlying module is interval decomposable. We
first provide in Section 3.1 a few specific examples of approximation schemes based on fibered barcodes that we present
to gather intuition and motivation for our main construction, that we then present in Section 3.2. The corresponding
pseudo-code and algorithm are given in Section 3.3.


3.1. Motivation 


The goal of this section is to frame the general question of reconstructing a multipersistence module from its fibered 
barcode. There are many ways of doing so, but the most natural ones are not necessarily the easiest to compute.


For the sake of simplicity, assume that the underlying module is a single interval module $M = I$ (see Definition 2.8).
Since interval modules are characterized by their supports, the goal is to recover $\mathrm{supp}(I)$. Moreover, if $I$ is discretely
presented, one can find an exact sequence of graded modules


$R \to G \to I \to 0,$


such that the critical points of $\mathrm{supp}(I)$ (see Equation (1)) provide bases for the modules $G$ and $R$ (we recall that, intuitively,
$G$ and $R$ represent the homology generators and relations respectively). Technical details for finding such exact sequences
can be found in, e.g., Appendix A], and an example of such a sequence is given in Figure 3. Hence, only the facets
or critical points of $\mathrm{supp}(I)$ need to be captured or approximated in order to recover $I$ when it is discretely presented.
Figure 3: Let $I$ be the interval whose support is colored in grey, and let $\varphi_\cdot^\cdot$ denote the transition maps of $I$. The
graded modules $R$ and $G$ can be constructed as follows: $G$ is defined as the free graded module $G := \langle g_1, g_2, g_3 \rangle$ (where
the grades of $g_1$, $g_2$ and $g_3$ are $(x_1, y_3)$, $(x_2, y_2)$ and $(x_3, y_1)$ respectively, i.e., they are given by their positions in the
figure), and $R$ is defined as the (not necessarily free) graded module $R := \langle r_1, \dots, r_5 \rangle$ (where the grades of the $r_i$'s are
also given by the figure). Since $r_1$, $r_2$ and $r_3$ are the images of $g_1$, $g_3$ and $g_3$ respectively under the transition maps towards
their grades, and $r_4$ and $r_5$ are differences of images of generators under these transition maps, it follows that
$R \to G \to I \to 0$ is an exact sequence.


There are many different ways, for a given n-interval module $I$, to define candidate critical points, that we call corners,
using the endpoints of its fibered barcode, e.g., by using the minimum and maximum of consecutive endpoint coordinates.
Hence, it is natural to define our candidate approximation $\tilde{I}$ with model selection, i.e., by minimizing some penalty cost
$\mathrm{pen} : \mathcal{S} \to \mathbb{R}_+$, where $\mathcal{S}$ is the set of discretely presented interval modules having the same fibered barcode as $I$, or a
subset thereof. See Figure 4 for examples of sets $\mathcal{S}$ and corresponding candidate approximations. This penalty would
forbid, e.g., overly complicated approximations that have a lot of corners. For instance, minimizing the penalty


$\mathrm{pen} : \tilde{I} \mapsto \#\,\text{corners of } \mathrm{supp}(\tilde{I})$ (4)


would provide a sparse approximation of $I$. Actually, when one assumes that the underlying interval module $I$ is discretely
presented with facets that are large enough with respect to the family of lines $L$ of the fibered barcode (see Proposition
4.4 for precise statements), it is easy to show that $I$ minimizes penalty (4). Indeed, as all the facets are detected by some
endpoints of the fibered barcode, any candidate has at least as many facets as $I$.


Figure 4: Example of candidates for a 2-interval module $I$ with support in $\mathbb{R}^2$. (Left) Given the $L$-fibered barcode of $I$,
where $L$ is the family of the four black lines, we want to approximate $I$ with an element of $\mathcal{S}$, i.e., an interval module with
the same fibered barcode. (Middle) When one further constrains the set $\mathcal{S}$ by asking to have at most one corner between
two consecutive endpoints of the fibered barcode, the whole set $\mathcal{S}$ can be computed explicitly. (Right) The set $\mathcal{S}$ can also
be described as the set of intervals which have to go through the blue path, and which can arbitrarily choose between
the red or green path at three different locations. Hence, the cardinality of $\mathcal{S}$ is $2^3$.


Remark 3.1. For n-interval modules, $\mathcal{S}$ is generally a set of cardinality $c^d$, where $c$ is the number of possible corners between
$2^{n-1}$ birthpoints or deathpoints, and $d$ is the number of corners. For instance, in Figure 4, one has $n = 2$, $c = 2$ and
$d = 3$. Unfortunately, $c$ grows exponentially with the dimension $n$, and $d$ is difficult to control in practice, since it heavily
depends on the number of lines in the fibered barcode and the regularity of the underlying interval module $I$. Minimizing
a penalty over $\mathcal{S}$ is thus practical only for low dimension $n$ and small number of lines in the fibered barcode. Hence, our
general approximation scheme presented in Section 3.3 does not use penalty minimization, but is rather defined with
arbitrary corner choices.


Note also that there are cases when the corner choices are canonical. For instance, any 2-multipersistence module $M$
with transition maps $\varphi_\cdot^\cdot$ that is weakly exact, i.e., that satisfies, for any $x \le y$,


$\mathrm{im}(\varphi_x^y) = \mathrm{im}(\varphi_{(x_1, y_2)}^y) \cap \mathrm{im}(\varphi_{(y_1, x_2)}^y)$ and $\ker(\varphi_x^y) = \ker(\varphi_x^{(x_1, y_2)}) + \ker(\varphi_x^{(y_1, x_2)}),$


is rectangle decomposable [BLO22]. Hence, a canonical approximation of a summand $I$ of $M$ is given by the interval
module whose support is the rectangle with corners $(\min_l (b_l^I)_1, \min_l (b_l^I)_2)$ and $(\max_l (d_l^I)_1, \max_l (d_l^I)_2)$, where $l$ goes
through the family of lines $L$ of the fibered barcode.
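
The canonical rectangle described above is straightforward to compute once the endpoints of a summand along each line are known. The sketch below assumes that the lower corner is built from the birthpoints and the upper corner from the deathpoints, and it is written coordinate-wise so that it also runs for $n > 2$, although the statement above only concerns $n = 2$. The function name and the toy data are hypothetical.

```python
def canonical_rectangle(births, deaths):
    """Corners of the rectangle spanned by the endpoints of one summand.

    births and deaths list the (non trivial) birthpoints and deathpoints of the
    summand along the lines of the fibered barcode.
    """
    n = len(births[0])
    lower = tuple(min(b[i] for b in births) for i in range(n))
    upper = tuple(max(d[i] for d in deaths) for i in range(n))
    return lower, upper


# Endpoints of the rectangle [0,2]x[0,1] along the diagonal lines through
# (0,0), (1,0) and (0,0.5).
births = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.5)]
deaths = [(1.0, 1.0), (2.0, 1.0), (0.5, 1.0)]
lower, upper = canonical_rectangle(births, deaths)
```

On this toy input the three lines are enough to recover the rectangle exactly; with fewer or badly placed lines the result would only be an inner approximation.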


3.2. Line families, corners and exact matchings 


In this section, we provide three additional definitions that turn out to be very useful for describing our approximation scheme
in Section 3.3, as well as for proving associated guarantees for interval decomposable modules in Sections 4 and 5.


We first introduce a few notations: we let $(e_1, \dots, e_n)$ be the canonical basis of $\mathbb{R}^n$, $d_\infty$ denote the $\|\cdot\|_\infty$ distance in
$\mathbb{R}^n$, and, given a set $A \subseteq \mathbb{R}^n$ and $\delta > 0$, we let $A^\delta$ denote the $\delta$-offset of $A$, defined as $A^\delta := \{x \in \mathbb{R}^n : d_\infty(x, A) \le \delta\}$,
and we let $\mathrm{conv}(A)$ denote the convex hull of $A$. Moreover, given a hyperplane $H \subset \mathbb{R}^n$ and its two associated vectors
$a_H, b_H \in \mathbb{R}^n$ which satisfy $H = b_H + \{x \in \mathbb{R}^n : \langle x, a_H \rangle = 0\}$, we call $a_H$ the codirection of $H$ (similarly to the codirection
of facets, see Definition 2.14). Finally, when $a_H$ is a vector in the canonical basis of $\mathbb{R}^n$, i.e., there exists $i \in [\![1, n]\!]$ such
that $a_H = e_i$, we slightly abuse notation and also call $i$ the codirection of $H$.

Our first definition characterizes those families of lines that evenly cover compact sets in $\mathbb{R}^n$.


Definition 3.2 ($\delta$-regularly distributed lines filling a compact set). Let $L$ be a set of diagonal lines in $\mathbb{R}^n$ and $K \subseteq \mathbb{R}^n$ be
a compact set. Then, we say that:


1. two diagonal lines $l, l' \in L$ are $\delta$-consecutive (or simply consecutive when $\delta$ is clear) if there exists $u \in \{0, 1\}^n \setminus \{\mathbf{0}, \mathbf{1}\}$
such that $l' = l + \delta u$.


2. two diagonal lines $l, l' \in L$ are $\delta$-comparable if there exists a positive or negative vector $u \in \mathbb{R}^n$ with $\|u\|_\infty \le \delta$
such that $l' = l + u$, where $u$ is said to be positive (resp. negative), written as $u \ge 0$ (resp. $u \le 0$), if and only if
$(u)_i \ge 0$ (resp. $(u)_i \le 0$) for all $i \in [\![1, n]\!]$. If $u$ is positive (resp. negative), we write $l' \ge l$ (resp. $l' \le l$).

3. $L$ is $\delta$-regularly distributed if, for any pair of lines $(l, l') \in L$, there exists a sequence of $\delta$-consecutive lines $\{l_1, \dots, l_k\}$
in $L$ such that $l = l_1$ and $l' = l_k$.


4. for a given line $l$ in a $\delta$-regularly distributed family of lines $L$, we call $L_l := L \cap \{l + \delta u : u \in \{0, 1\}^{n-1} \times \{0\}\}$ the
$L$-surrounding set of $l$. In particular, one has $|L_l| \le 2^{n-1}$.


5. $L$ $\delta$-fills $K$ if any point of $K$ is at distance at most $\delta/2$ from some line in $L$. More formally, $K$ is included in the offset
$L^{\delta/2}$. When $\delta$ is clear from the context, we simply say that $L$ fills $K$.


Remark 3.3. Let $L$ be a set of $\delta$-regularly distributed diagonal lines that $\delta$-fills some compact set $K \subset \mathbb{R}^n$. Then $L$ is
actually distributed over a grid (along the canonical axes of $\mathbb{R}^n$) on $K$. More precisely, assume that there is a line $l \in L$ such
that $0 \in l$. Now, assume that there exist integers $\alpha_1, \dots, \alpha_n \in \mathbb{Z}$ such that $x = (\alpha_1 \delta, \dots, \alpha_n \delta) \in K$. Then, using items (3)
and (5) of Definition 3.2, there must exist some line $l_x \in L$ such that $x \in l_x$. Hence, to be more concise, we will call such a
set of lines a $\delta$-grid of $K$.
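
The notions of Definition 3.2 can be made concrete with a small sketch. It rests on assumptions that are not part of the text: a diagonal line is encoded by one of its points `p`, the grid passes through the origin, and the helper names (`dist_to_diagonal_line`, `surrounding_set`, `nearest_grid_line`) are hypothetical. It uses the elementary fact that the $d_\infty$ distance from $x$ to the line $\{p + t\mathbf{1}\}$ is half the spread of the coordinates of $x - p$.

```python
import itertools


def dist_to_diagonal_line(x, p):
    """Sup-norm distance from the point x to the diagonal line {p + t*(1,...,1)}."""
    r = [xi - pi for xi, pi in zip(x, p)]
    # min over t of max_i |r_i - t| is attained at the midrange of r
    return (max(r) - min(r)) / 2.0


def surrounding_set(p, delta):
    """Base points of the lines l + delta*u, for u in {0,1}^(n-1) x {0} (item 4)."""
    n = len(p)
    out = []
    for u in itertools.product((0, 1), repeat=n - 1):
        shift = u + (0,)  # the last coordinate is never shifted
        out.append(tuple(pi + delta * ui for pi, ui in zip(p, shift)))
    return out


def nearest_grid_line(x, delta):
    """Base point (last coordinate 0) of a grid line at distance <= delta/2 from x.

    Assumption of this sketch: the grid lines are exactly the diagonal lines
    through the points of delta*Z^(n-1) x {0}.
    """
    y = [xi - x[-1] for xi in x]  # slide x along the diagonal until x_n = 0
    return tuple(delta * round(yi / delta) for yi in y[:-1]) + (0.0,)
```

Rounding each of the first $n-1$ coordinates moves it by at most $\delta/2$, so the spread of the residual is at most $\delta$ and the returned line is at $d_\infty$ distance at most $\delta/2$, which is the filling condition of item (5).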


Families of lines that are $\delta$-grids of $K$ allow us to formally define candidate critical points, or corners, that can be used to
approximate the critical points of the true underlying interval module (see Section 3.1 above). In the following definition,
we introduce points called corners that can be defined solely from the fibered barcode of an interval module $I$, and that
we will use as proxies for the critical points of $I$ (as per Equation (1)) in our approximation scheme.


Definition 3.4 (Birth and death corners). Given a discretely presented interval $I$ with birth and death corners included
in a compact set $K \subset \mathbb{R}^n$, and a $\delta$-grid $L$ of the offset $K^{2\delta}$, we say that $b$ is a finite ($L$-)birth corner (resp. $d$ is a finite
($L$-)death corner) if:


1. for each dimension $i \in [\![1, n]\!]$, there exists a hyperplane $H_i$ of codirection $i$ intersecting $K$, and the family $(H_i)_i$
satisfies $\bigcap_i H_i = b$ (resp. $\bigcap_i H_i = d$),


2. there exists a line $l_0 \in L$ such that


(a) $b \in \mathrm{conv}(L_{l_0})$ (resp. $d \in \mathrm{conv}(L_{l_0})$), where $L_{l_0}$ is the $L$-surrounding set of $l_0$ (see Definition 3.2),
(b) for each line $l \in L_{l_0}$, the endpoint $b^I_l$ (resp. $d^I_l$) is non trivial,


(c) for each dimension $i \in [\![1, n]\!]$, there exists $l_i \in L_{l_0}$ such that $b^I_{l_i} \in H_i$ (resp. $d^I_{l_i} \in H_i$),


and we say that $b$ is a pseudo ($L$-)birth corner (resp. $d$ is a pseudo ($L$-)death corner) if:


1. there exists a set $J \subset [\![1, n]\!]$, called the codirection of $b$ (resp. $d$) and denoted with $\mathrm{codir}(b)$ (resp. $\mathrm{codir}(d)$), such
that for each dimension $j \in J$, there exists a hyperplane $H_j$ of codirection $j$ intersecting $K$ such that $\bigcap_j H_j \ni b$ (resp.
$\bigcap_j H_j \ni d$). The set $[\![1, n]\!] \setminus J$ is called the direction of $b$ (resp. $d$) and is denoted with $\mathrm{dir}(b)$ (resp. $\mathrm{dir}(d)$).


2. there exists a line $l_0 \in L$ such that


(a) $b \in \mathrm{conv}(L_{l_0}) \cap K^{2\delta} \setminus K$ (resp. $d \in \mathrm{conv}(L_{l_0}) \cap K^{2\delta} \setminus K$),
(b) for each line $l \in L_{l_0}$, the endpoint $b^I_l$ (resp. $d^I_l$) is non trivial,


(c) for each dimension $j \in J$, there exists $l_j \in L_{l_0}$ such that $b^I_{l_j} \in H_j$ (resp. $d^I_{l_j} \in H_j$).


A pseudo birth (resp. death) corner $b$ (resp. $d$) is said to be minimal (resp. maximal) if for any other pseudo birth corner $b'$
(resp. pseudo death corner $d'$) such that $\mathrm{codir}(b') \subseteq \mathrm{codir}(b)$ (and thus $\mathrm{dir}(b') \supseteq \mathrm{dir}(b)$), there exists a dimension $i$ such
that $b_i < b'_i$ (resp. $d_i > d'_i$).

Finally, we say that $b$ (resp. $d$) is an infinite ($L$-)birth (resp. death) corner if there exists a minimal (resp. maximal)
pseudo birth (resp. death) corner $b'$ (resp. $d'$) such that $b_i = -\infty$ (resp. $d_i = +\infty$) if $i \in \mathrm{dir}(b')$ (resp. $i \in \mathrm{dir}(d')$) and
$b_i = b'_i$ (resp. $d_i = d'_i$) if $i \in \mathrm{codir}(b')$ (resp. $i \in \mathrm{codir}(d')$).


In Section 3.1 and Definitions 3.2 and 3.4 above, we have assumed that the true underlying module was a
single interval module. In order to handle more general multipersistence modules, we need a way to distinguish
between the bars of the fibered barcodes of different summands (when the module is decomposable). This is precisely
the role of exact matchings, which we define below. In other words, exact matchings are functions that connect bars
of different barcodes from the fibered barcode in a way that is consistent with the decomposition of the underlying
multipersistence module $M$.


Definition 3.5 (Exact matching). Let $M = \bigoplus_{i \in \mathcal{I}} M_i$ be an $n$-multipersistence module. Let $l, l'$ be two lines in $\mathbb{R}^n$. A
map $m$ between the corresponding barcodes $m : B(M|_l) \to B(M|_{l'}) \cup \{\emptyset\}$ is called a matching between $l$ and $l'$ if the
restriction of $m$ to $m^{-1}(B(M|_{l'}))$ is injective.

Furthermore, if we also assume that $M$ is interval decomposable, i.e., that the $M_i$'s are interval modules, then we
say that the matching $m$ is exact if one has $\iota_1(b) = \iota_2(m(b))$ for any bar $b \in B(M|_l)$, where $\iota_1 : B(M|_l) \to \mathcal{I}$ and
$\iota_2 : B(M|_{l'}) \to \mathcal{I}$ are correspondences between the bars of barcodes in the fibered barcode and the interval summands of
$M$, obtained by Equation (3). In other words, bars that are matched under $m$ correspond to the same underlying interval
summand of $M$.


There are many ways of defining exact matching functions. For instance, under some assumptions, matchings induced
by the Wasserstein distance between barcodes are exact (see Section 6.1), and we prove that the matching given by the
vineyard algorithm is exact in Section 6.3.
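
As an illustration of Definition 3.5, the following minimal sketch checks whether a given matching is exact when the summand of every bar is known. The encoding is an assumption made for this example only: bars are hashable tuples, the matching `m` is a dictionary sending each bar to a bar or to `None` (standing for $\emptyset$), and `iota1`, `iota2` are dictionaries playing the role of $\iota_1$, $\iota_2$. The exactness condition is only tested on bars that are actually matched, which is one possible reading of the definition.

```python
def is_matching(m):
    """True if m is injective on the bars that are not sent to None."""
    images = [image for image in m.values() if image is not None]
    return len(images) == len(set(images))


def is_exact(m, iota1, iota2):
    """True if m is a matching whose matched bars share the same summand index."""
    return is_matching(m) and all(
        iota1[bar] == iota2[image]
        for bar, image in m.items()
        if image is not None  # assumption: unmatched bars are not constrained
    )
```

In practice the correspondences $\iota_1$ and $\iota_2$ are not available, which is why exactness has to be established for a specific construction, as done for vineyards in Section 6.3.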

We are now equipped for stating our approximation scheme, which constructs a candidate module by computing 
corners from the fibered barcode and exact matchings. 


3.3. Algorithms for approximating modules 


In this section, we provide Algorithm 1, which can approximate any multipersistence module using the fibered barcode of
an appropriate set of lines. Roughly speaking, Algorithm 1 works in three steps:


1. compute the $L$-fibered barcode of the underlying module,
2. match bars that correspond to the same underlying summand together using an exact matching, and


3. for each summand, use the endpoints of the corresponding bars to compute corners, using Algorithm 2.


Step 1 can be performed using any persistent homology software (such as, e.g., Gudhi, Ripser, Phat, etc.), or with
Rivet when $n = 2$. Our code can be found at and is based on the
vineyard update algorithm [CSEM06], which allows running steps 1 and 2 jointly (see Section 6.3).

Note that while we can guarantee that the output is close to the underlying multipersistence module $M$ only when $M$
is decomposable into interval summands (see Sections 4 and 5), Algorithm 1 makes no assumption about $M$ at all and can
be applied generally. Note however that since Algorithm 1 always returns a multipersistence module that is interval
decomposable, the output decomposition is necessarily wrong if $M$ is not interval decomposable.


Algorithm 1: APPROXIMATEMODULE

Input 1: Multipersistence module $M$
Input 2: Family of lines $L$ which is a $\delta$-grid of the offset $K^{2\delta}$ of a compact set $K \subset \mathbb{R}^n$
Input 3: Exact matching $m$
Output: Interval decomposable multipersistence module $\tilde{M}$

Compute $FB(M)_L$, i.e., the $L$-fibered barcode of $M$;
$S$ ← []; # S is the list of interval summands, initialized as the empty list
for $l \in L$ do
    # If S is empty, populate it with the first barcode, each bar initializing a new summand
    if $S$ == [] then
        for $[b^M_l, d^M_l] \in B(M|_l)$ do
            $B$ ← $\{[b^M_l, d^M_l]\}$;
            $S$.append($B$);
        end
    # If S is not empty, process each bar in the current barcode
    else
        for $[b^M_l, d^M_l] \in B(M|_l)$ do
            # Check whether it is in the image of the exact matching
            if $\exists B \in S$ and $[b, d] \in B$ s.t. $[b^M_l, d^M_l] = m([b, d])$ then
                $B$.append($[b^M_l, d^M_l]$); # If it is, attach the bar to the corresponding summand
            # Otherwise initialize a new summand with the bar
            else
                $B$ ← $\{[b^M_l, d^M_l]\}$;
                $S$.append($B$);
            end
        end
    end
end
# For each summand in S characterized by a set of bars, build an approximate interval summand by computing candidate corners
for $B \in S$ do
    $I(B)$ ← APPROXIMATEINTERVAL($B$);
end
Return $\tilde{M} := \bigoplus_{B \in S} I(B)$;
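
The grouping loop of Algorithm 1 can be sketched as follows. This is only an illustration under assumptions that are not in the text: bars are hashable tuples `(line_index, birth, death)`, the barcodes are given as a list ordered like the lines of $L$, and the exact matching is a single dictionary `m` sending a bar to its matched bar on a later line (or to `None`). The function name `group_bars` is hypothetical.

```python
def group_bars(barcodes, m):
    """Group the bars of a fibered barcode into summands, following a matching.

    barcodes: list of barcodes, one per line, each a list of hashable bars.
    m: dict sending a bar to its matched bar (or None if it is unmatched).
    Returns the list S of summands, each summand being a list of bars.
    """
    summands = []  # the list S of Algorithm 1
    owner = {}     # bar -> index in S of the summand containing it
    # Invert the matching once, so that "is this bar in the image of m?" is a lookup.
    preimage = {image: bar for bar, image in m.items() if image is not None}
    for barcode in barcodes:
        for bar in barcode:
            source = preimage.get(bar)
            if source is not None and source in owner:
                # The bar is matched to an already processed bar: same summand.
                index = owner[source]
                summands[index].append(bar)
            else:
                # Otherwise the bar starts a new summand (this also covers S == []).
                index = len(summands)
                summands.append([bar])
            owner[bar] = index
    return summands
```

Each resulting list of bars would then be passed to APPROXIMATEINTERVAL. Inverting the matching up front avoids scanning every summand for every bar, which is a design choice of this sketch rather than something prescribed by the pseudo-code.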


We now describe the algorithm APPROXIMATEINTERVAL, which is used at the end of Algorithm 1. Our algorithm
APPROXIMATEINTERVAL is defined in two steps:


1. first, we label birthpoints and deathpoints to identify the facets of $I$ (Algorithm 3),


2. then, we use these labels to compute the interval corners (Algorithm 4).

Once corners are computed, one can use them as approximate critical points in order to output a discretely
presented interval module with Equation (1).


Algorithm 2: APPROXIMATEINTERVAL
Input: Set of bars $B = \{[b_l, d_l] : l \in L_B \subseteq L\}$
Output: Discretely presented interval module $I$ parameterized by a list of birth and death corners
labs ← LABELENDPOINTS($B$);
$C_B(I), C_D(I)$ ← COMPUTECORNERS($B$, labs);
Return $I(B) := \mathrm{Ind}\left(\bigcup_{c \in C_B(I)} \bigcup_{c' \in C_D(I)} R_{c,c'}\right)$;


We first describe LABELENDPOINTS. The core idea of the algorithm is, for a given bar in $I$, to look at its corresponding
$L$-surrounding set (see item (4) in Definition 3.2). If there exists a hyperplane $H$ such that all endpoints in this surrounding
set belong to $H$, we identify $H$ as a facet, and we label the bar with the codirection of $H$.


Algorithm 3: LABELENDPOINTS
Input: Set of bars $B = \{[b_l, d_l] : l \in L_B \subseteq L\}$
Output: List labs of labels for each endpoint in $B$
for $l \in L_B$ do
    labs($b_l$) ← [];
    labs($d_l$) ← [];
end
for $l \in L_B$ do
    if $\exists i \in [\![1, n]\!]$ and $c_i \in \mathbb{R}$ such that $\forall l' \in L_l$, $(b_{l'})_i = c_i$ then
        for $l' \in L_l$ do
            labs($b_{l'}$).append($(i, c_i)$);
        end
    end
    if $\exists i \in [\![1, n]\!]$ and $c_i \in \mathbb{R}$ such that $\forall l' \in L_l$, $(d_{l'})_i = c_i$ then
        for $l' \in L_l$ do
            labs($d_{l'}$).append($(i, c_i)$);
        end
    end
end
Return labs;


Note that endpoints can have zero or more than one label. For instance, an endpoint that belongs to the intersection
of several facets might have multiple labels. However, if several labels are identified, they must be associated with different
dimensions. See Figure 5 for examples of label assignments when the underlying interval module has rectangle support.
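
To make the labelling step concrete, here is a hedged sketch of LABELENDPOINTS restricted to birthpoints. Several choices are assumptions of this example and not of the paper: lines are indexed by integer tuples $k \in \mathbb{Z}^{n-1}$ (the line through $\delta \cdot (k, 0)$), dimensions are 0-indexed, coordinates are compared up to a tolerance `tol`, and surrounding sets that are not fully present in the input are skipped. The helper `rectangle_birthpoint`, which gives the birthpoint of an interval supported on a rectangle with lower corner `a`, is also hypothetical.

```python
import itertools


def rectangle_birthpoint(p, a):
    """Birthpoint along the diagonal line through p for the support {x : x >= a}."""
    t = max(ai - pi for ai, pi in zip(a, p))
    return tuple(pi + t for pi in p)


def label_endpoints(points, tol=1e-9):
    """points: dict sending a line index (tuple of n-1 ints) to an endpoint in R^n.

    Returns a dict sending each line index to its list of labels (i, c_i).
    """
    labels = {k: [] for k in points}
    for k in points:
        # Surrounding set of the line k: shifts by {0,1}^(n-1).
        nbrs = [tuple(a + b for a, b in zip(k, u))
                for u in itertools.product((0, 1), repeat=len(k))]
        if not all(q in points for q in nbrs):
            continue  # incomplete surrounding set: skipped in this sketch
        for i in range(len(points[k])):
            c = points[k][i]
            # All endpoints of the surrounding set lie on the hyperplane x_i = c.
            if all(abs(points[q][i] - c) <= tol for q in nbrs):
                for q in nbrs:
                    if (i, c) not in labels[q]:
                        labels[q].append((i, c))
    return labels
```

For a rectangle in the plane with lower corner at the origin and $\delta = 1$, the birthpoint on the line through the origin is the corner itself, and it collects one label per facet, in line with the remark above that endpoints on the intersection of several facets may carry several labels.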


Figure 5: Example of birthpoint labelling for an interval module $I$ with rectangle support (axes $e_1$, $e_2$, $e_3$), with three surrounding sets of
lines $L_{l_1}, L_{l_2}, L_{l_3}$, associated to three lines $l_1, l_2, l_3$. The labels of $l_1, l_2, l_3$ that are identified correspond to the red, blue and
grey colored facets of $I$ respectively.


Remark 3.6. Detecting facets with $2^{n-1}$ endpoints sharing the same labels is not necessarily optimal. For instance, a
rectangle module can be recovered with only three bars passing through it in R’. However, it does allow for simpler 
proofs. 

Finally, we describe COMPUTECORNERS. The core idea of the algorithm is to use the labels identified by LABELENDPOINTS
to compute candidate corners in the following way: if all birthpoints (resp. deathpoints) in a surrounding set
have at least one associated facet, i.e., have a non-empty list of labels, then a candidate corner can be defined using
the minimum (resp. maximum) of all birthpoints (resp. deathpoints) coordinates. We only present the pseudo-code for
birthpoints since the code for deathpoints is symmetric and can be obtained by replacing min by max and $-\infty$ by $+\infty$.


Algorithm 4: COMPUTECORNERS
Input 1: Set of bars $B = \{[b_l, d_l] : l \in L_B \subseteq L\}$
Input 2: List labs of labels for each endpoint in $B$
Output: List of birth corners $C_B$
$C_B$ ← [];
for $l \in L_B$ do
    $B_{L_l} := \{b_{l'} : l' \in L_l \cap L_B\}$; # Note that $B_{L_l} \subseteq B$ by construction
    # Check whether all birthpoints in the surrounding set belong to the support K of the critical points of the underlying module
    if $B_{L_l} \subset K$ then
        # Compute birth corner if all the birthpoints are labelled
        if labs($b$) $\neq \emptyset$, $\forall b \in B_{L_l}$ then
            $\{(j, c_j) : j \in J\}$ ← $\bigcup_{b \in B_{L_l}}$ labs($b$); # $J \subseteq [\![1, n]\!]$ is the corresponding set of codirections
            Define $C^l \in \mathbb{R}^n$ as
                $(C^l)_j = c_j$ if $j \in J$
                $(C^l)_j = \min\{(b_{l'})_j : l' \in L_l \cap L_B\}$ otherwise
            $C_B$.append($C^l$);
        # If the birthpoints are not all labeled, we simply keep the birthpoints themselves as corners
        else
            for $l' \in L_l \cap L_B$ do
                $C_B$.append($b_{l'}$);
            end
        end
    # If some birthpoints are not in K, they must correspond to infinite facets
    else
        Assert $B_{L_l} \cap K^{2\delta} \setminus K \neq \emptyset$;
        Assert labs($b$) $\neq \emptyset$ for all $b \in B_{L_l}$;
        $\{(j, c_j) : j \in J\}$ ← $\bigcup_{b \in B_{L_l}}$ labs($b$); # The cardinality of the set of codirections $J \subsetneq [\![1, n]\!]$ must be strictly less than n
        Define $C^l \in \mathbb{R}^n$ as
            $(C^l)_j = c_j$ if $j \in J$
            $(C^l)_j = -\infty$ otherwise
        $C_B$.append($C^l$);
    end
end
Return $C_B$;


Note that, by construction, the candidate corners computed by COMPUTECORNERS are all finite, pseudo or infinite
$L$-corners, as defined in Definition 3.4. Now that we have defined how to compute an approximation, we show in the following
sections that the approximate multipersistence module $\tilde{M}$ provided by Algorithm 1 is a good approximation
of the underlying module $M$ when $M$ is interval decomposable.
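
The finite branch of COMPUTECORNERS can be illustrated by the short sketch below. It is a simplified reading under stated assumptions: it handles a single surrounding set whose birthpoints all lie in $K$, dimensions are 0-indexed, labels are pairs `(i, c_i)` as produced by a labelling step, and the function name `finite_birth_corner` is hypothetical. When some birthpoint carries no label, it returns `None`, whereas Algorithm 4 keeps the raw birthpoints as corners in that case.

```python
def finite_birth_corner(points, labels):
    """Candidate birth corner of one surrounding set.

    points: list of birthpoints (tuples of n floats) of the surrounding set.
    labels: list, parallel to points, of label lists [(i, c_i), ...].
    """
    if any(len(lab) == 0 for lab in labels):
        return None  # not all birthpoints are labelled
    # Union of all labels: the detected codirections J and their coordinates c_j.
    fixed = {}
    for lab in labels:
        for i, c in lab:
            fixed[i] = c
    n = len(points[0])
    # Labelled coordinates are read off the facets, the others are minimized.
    return tuple(
        fixed[j] if j in fixed else min(p[j] for p in points)
        for j in range(n)
    )
```

For death corners the same sketch applies with `min` replaced by `max`, as stated above for the pseudo-code.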


Complexity. Computing the $L$-fibered barcode $FB(M)_L$ on a simplicial complex, as well as assigning the corresponding
bars to their associated summands in the decomposition of $M$, can be done with the vineyard algorithm and matching
with complexity $O(N^3 + |L| \cdot N \cdot T)$, where $N$ is the number of simplices in the simplicial complex, and
$T$ is the maximal number of transpositions required to update the single-parameter filtrations corresponding to the
consecutive lines in $L$. In the worst case scenario, $T = N^2$. Note that $T$ usually decreases as $|L|$ increases, and that this
computation can be easily parallelized (see Section 7).

Now, adding the complexities of Algorithms 3 and 4, the total complexity of Algorithm 1 is


$O(N^3 + |L| \cdot N \cdot T + |L| \cdot n \cdot 2^{n-1})$.


Of importance, the dependence on $n$ is much better than the (exact) decomposition algorithm proposed in [DX22] whose
complexity is $O(N^{n(2\omega+1)})$, where $\omega$ denotes the matrix multiplication exponent. It is also better than Rivet [LW15] (which works only when $n = 2$), whose complexity is
$O(N^3 \kappa + (N + \log \kappa)\kappa^2)$, where $\kappa = \kappa_x \kappa_y$ is the product of unique $x$ and $y$ coordinates in the support of the module.
Moreover, our complexity can be controlled by the number of lines, which is user-dependent. Again, we illustrate this
useful property in Section 7.


3.4. Endpoint properties 


In this section, we prove a few preliminary results about endpoints of interval modules, which will turn out to be very useful for
quantifying the error made by Algorithm 1 when approximating the true endpoints of a multipersistence module with
$L$-corners, as we do in Sections 4 and 5. Roughly speaking, we prove in this section that the location of endpoints is
related to the rectangle hull of other endpoints corresponding to lines in some specific surrounding sets.


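Definition 3.7. Let $S \subset \mathbb{R}^n$. The rectangle hull of $S$, denoted by $\mathrm{recthull}[S]$, is defined with

$\mathrm{recthull}[S] := \left\{x \in \mathbb{R}^n : \forall i \in [\![1, n]\!],\ \min_{s \in S} s_i \leq x_i \leq \max_{s \in S} s_i\right\} = R_{\wedge S, \vee S}$,

where $(\wedge S)_i := \min_{s \in S} s_i$ and $(\vee S)_i := \max_{s \in S} s_i$.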
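
A minimal sketch of Definition 3.7 is given below. The function names `rect_hull` and `in_rect_hull` are hypothetical, and points are assumed to be given as tuples of finite coordinates.

```python
def rect_hull(S):
    """Return the corners (meet, join) of the rectangle hull of a finite set S."""
    n = len(S[0])
    lower = tuple(min(s[i] for s in S) for i in range(n))  # coordinate-wise min
    upper = tuple(max(s[i] for s in S) for i in range(n))  # coordinate-wise max
    return lower, upper


def in_rect_hull(x, S):
    """True if x belongs to recthull[S]."""
    lower, upper = rect_hull(S)
    return all(lo <= xi <= hi for lo, xi, hi in zip(lower, x, upper))
```

Note that the rectangle hull is in general larger than the convex hull: for the three points $(0,2)$, $(1,1)$, $(2,0)$, the point $(1.5, 1.5)$ lies in the rectangle hull $R_{(0,0),(2,2)}$ although it is not a convex combination of the three points.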


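Lemma 3.8 (Endpoints bound). Let $I$ be an $n$-interval module. Let $\delta > 0$, $K$ be a compact set of $\mathbb{R}^n$ and $L$ be a $\delta$-grid
of $K$. Let $x \in K^\delta$, $l_x$ be the diagonal line passing through $x$ and $d^I_{l_x} \in U[I]$ be the associated deathpoint. Finally, let
$L_{x,\delta} = \{l \in L : d_\infty(x, l) \leq \delta \text{ and } l_x, l \text{ are } \delta\text{-comparable}\}$, which is non-empty since $L$ fills $K^{2\delta}$. Assume that for any line
$l$ in $L_{x,\delta}$, one has $\mathrm{supp}(I) \cap l \neq \emptyset$, and let $D^I_{x,\delta}$ be the set of the associated deathpoints: $D^I_{x,\delta} = \{d^I_l : l \in L_{x,\delta}\}$. Then, $d^I_{l_x}$
belongs to the rectangle hull of $D^I_{x,\delta}$: one has $d^I_{l_x} \in \mathrm{recthull}[D^I_{x,\delta}]$.

Similarly, if $b^I_{l_x} \in L[I]$ is a birthpoint, then $b^I_{l_x} \in \mathrm{recthull}[B^I_{x,\delta}]$, where $B^I_{x,\delta}$ is the set of birthpoints associated to $L_{x,\delta}$.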

In other words, the endpoints of an interval module always belong to the rectangle hull of the endpoints associated
to neighbouring lines. See Figure 6 for an illustration.


Figure 6: Example of deathpoint bound in R?, with d € U[I], and D! 5 = {d;, dz, dz, d4}. (Left) Rectangle hull of the 
deathpoints D! s: (Right) Upper-boundary U [I]. 


Proof. We first prove the result for deathpoints. Note that the result is trivially satisfied if $d^I_{l_x}$ and the deathpoints in $D^I_{x,\delta}$
are infinite, so we assume that they are finite in the following. Let $j \in [\![1, n]\!]$ be an arbitrary dimension. To alleviate
notations, we let $d := d^I_{l_x}$. In order to prove the result, we will show that there exist two deathpoints $\overline{d}$ and $\underline{d}$ associated to
consecutive lines of $L_{x,\delta}$ such that $\underline{d}_j \leq d_j \leq \overline{d}_j$.


Construction of $\overline{d}$, $\underline{d}$. Let $H_j$ be the hyperplane $H_j = d + e_j^\perp$. Since $L$ fills $K^{2\delta}$ and $x \in K^\delta$, there exists a diagonal line
$l \in L$ such that $d_\infty(x, l) \leq \delta/2$. Moreover, since $l$ and $l_x$ (the line passing through $x$ and $d$) are both diagonal, one has
$d_\infty(d, l) = d_\infty(x, l) \leq \delta/2$. Let $\pi_l(d) \in l$ be the projection of $d$ onto $l$ that achieves $d_\infty(d, l)$, and let $d^j := l \cap H_j$. See
Figure 7 for an illustration of these objects.


Figure 7: Illustration of $H_j$, $d$, $l$, $d^j$.

Since $d^j$ and $d$ belong to $H_j$, they have the same $j$-th coordinate: $(d^j)_j = d_j$. Moreover, both $d^j$ and $\pi_l(d)$ belong
to the diagonal line $l$, hence they are comparable, and $\|d^j - \pi_l(d)\|_\infty = |(d^j - \pi_l(d))_i|$ for any $i \in [\![1, n]\!]$. Then, one
has

$\|d^j - d\|_\infty \leq \|d^j - \pi_l(d)\|_\infty + \|\pi_l(d) - d\|_\infty = |(d^j - \pi_l(d))_j| + \|\pi_l(d) - d\|_\infty = |(d - \pi_l(d))_j| + \|\pi_l(d) - d\|_\infty \leq 2\|\pi_l(d) - d\|_\infty \leq \delta$.

Let $d^+ = d^j + \delta \sum_{i \in J^+} e_i$ and $d^- = d^j - \delta \sum_{i \in J^-} e_i$, where


$J^+ := \{i \in [\![1, n]\!] \setminus \{j\} : (d^j)_i < d_i\}$ and $J^- := \{i \in [\![1, n]\!] \setminus \{j\} : (d^j)_i > d_i\}$.


By construction, one has $d^- \leq d \leq d^+$, with $d^-, d^+ \in H_j$, and $\|d^+ - d\|_\infty, \|d^- - d\|_\infty \leq \delta$. Since $l$ and the diagonal lines $\overline{l}$ and $\underline{l}$ passing
through $d^-$ and $d^+$ respectively are $\delta$-consecutive, and since $x \in K^\delta$, the projections of $x$ onto $\overline{l}$ and $\underline{l}$ are in $K^{2\delta}$, and
thus $\overline{l}, \underline{l}$ must belong to $L_{x,\delta}$, as by construction $l_x$ is $\delta$-comparable with the diagonal lines $\overline{l}$ and $\underline{l}$. Let $\overline{d} := d^I_{\overline{l}} \in \overline{l}$ and
$\underline{d} := d^I_{\underline{l}} \in \underline{l}$ be their deathpoints (which exist by assumption).


Proof of inequalities. We now show that $\overline{d}_j \geq d_j \geq \underline{d}_j$. We start with the second inequality. Since $d^+$ and $\underline{d}$ are on the
same diagonal line, they are comparable. If one had $d^+ < \underline{d}$ by contradiction, then the induced rectangle
$R_{d, \underline{d}}$ would not be flat since $d \leq d^+ < \underline{d}$, which would contradict Remark 2.20. As a consequence, $d^+ \geq \underline{d}$. Taking the $j$-th
coordinate yields $d_j = (d^+)_j \geq \underline{d}_j$. The first inequality holds using the same arguments.


This proof applies straightforwardly to birthpoints by symmetry. □


Using Lemma one can generalize Lemma 3.8 above to the case where some lines in $L_{x,\delta}$ have an empty
intersection with $\mathrm{supp}(I)$, and then define a common location for all endpoints that belong to the convex hull of the
same $L$-surrounding set, as we do in the following proposition.


Proposition 3.9. Let $I$ be an $n$-interval module. Let $\delta > 0$, $K$ be a compact set of $\mathbb{R}^n$ and $L$ be a $\delta$-grid of $K$. Let $l \in L$ such
that $|L_l| = 2^{n-1}$. Then, there exists a set $B_l$ (resp. $D_l$) such that for any $x \in \mathrm{conv}(L_l) \cap L[I]$ (resp. $\mathrm{conv}(L_l) \cap U[I]$), one has
either $x \in B_l$ (resp. $D_l$) or $\|b^I_{l_x} - d^I_{l_x}\|_\infty \leq \delta$, where $B_l$ (resp. $D_l$) is a rectangular set in $\mathbb{R}^n$ that can be constructed from the
birthpoints $(b^I_{l'})_{l' \in L_l}$ (resp. deathpoints $(d^I_{l'})_{l' \in L_l}$). Moreover, one has


1. $\sup\{t > 0 : x + t \cdot \mathbf{1} \in B_l\} \leq \delta$ (resp. $\sup\{t > 0 : x + t \cdot \mathbf{1} \in D_l\} \leq \delta$), and
2. $B_l$ (resp. $D_l$) is included in a ball of radius $\delta$: there exists $x_l$ such that $B_l$ (resp. $D_l$) $\subseteq \{y \in \mathbb{R}^n : \|y - x_l\|_\infty \leq \delta\}$.


Proof. We first construct $B_l$ and $D_l$, and then we will show items (1) and (2).


Definition of $B_l$, $D_l$. Let us first assume that $x$ is in the interior of $\mathrm{conv}(L_l)$, that we denote with $\mathrm{conv}(L_l)^\circ$. Note that
if there is a line $l_0$ that is $\delta$-comparable to $l_x$, and such that $B(I|_{l_0}) = \emptyset$, then by Lemma one immediately has
$\|b^I_{l_x} - d^I_{l_x}\|_\infty \leq \delta$. Hence, we now assume that the barcode along any line that is $\delta$-comparable to $l_x$ is not empty, which
means that the hypotheses of Lemma 3.8 are satisfied for $x$. Now, remark that since $L$ is a grid, if one is able to find a line
$l'$ in $L$ whose intersections with hyperplanes associated to the canonical axes of $\mathbb{R}^n$ are $\delta$-close to $x$, then, since $x$ is in
the interior of an $L$-surrounding set $L_l$, $l'$ must belong to that surrounding set $L_l$ as well. More formally, one has that, for
any line $l' \in L$,

$d_\infty(x, l' \cap H_i) \leq \delta \implies l' \in L_l$, where $H_i = \{y \in \mathbb{R}^n : y_i = x_i\}$.


This ensures that $L_{x,\delta}$ (see Lemma 3.8) is included in $L_l$ for any $x \in \mathrm{conv}(L_l)^\circ$, and thus that we can safely define


$D_l := \bigcup_{x \in \mathrm{conv}(L_l)^\circ} \mathrm{recthull}[D^I_{x,\delta}]$ and $B_l := \bigcup_{x \in \mathrm{conv}(L_l)^\circ} \mathrm{recthull}[B^I_{x,\delta}]$.


Note that $B_l$ and $D_l$ depend only on the endpoints of the lines in $L_l$ (since $L_{x,\delta} \subseteq L_l$ for all $x \in \mathrm{conv}(L_l)^\circ$), and that
$d^I_{l_x} \in D_l$ and $b^I_{l_x} \in B_l$ for any $x \in \mathrm{conv}(L_l)^\circ$ by Lemma 3.8. Furthermore, if $x$ is in the closure of $\mathrm{conv}(L_l)$, the previous
statements still hold since $D_l$ and $B_l$ are closed sets. We now show that $B_l$ and $D_l$ satisfy items (1) and (2).


Proof of (1). By applying Lemma 3.8 and its proof for dimension $j = n$ to all $x \in \mathrm{conv}(L_l)$, there exist deathpoints
$\overline{d}_n$ and $\underline{d}_n$ that satisfy $(\overline{d}_n)_n = \sup_{d \in D_l} d_n$ and $(\underline{d}_n)_n = \inf_{d \in D_l} d_n$ and $(\underline{d}_n)_n \leq x_n \leq (\overline{d}_n)_n$ for all $x \in \mathrm{conv}(L_l)$. Moreover,
these points are located on the lines $l$ and $l + \sum_{1 \leq j < n} \delta e_j$, which are $\delta$-consecutive. Thus, applying Lemma 2.22 on
this pair of lines, we end up with $D_l$ having a diagonal smaller than $\delta$. The same goes for birthpoints.


Proof of (2). Note first that


$D_l \subseteq \left\{y \in \mathbb{R}^n : \forall i \in [\![1, n]\!],\ \min_{l' \in L_l} (d^I_{l'})_i \leq y_i \leq \max_{l' \in L_l} (d^I_{l'})_i\right\}$,

and that, for any pair of lines $l_1, l_2 \in L_l$, there is a vector $v$ such that $l_1 = l_2 + v$ with $\|v\|_\infty \leq \delta$. Thus, one has
$v' := v + \delta \cdot \mathbf{1} \geq 0$ and $\|v'\|_\infty \leq 2\delta$. Moreover, $l_1$ can also be written as $l_1 = l_2 + v'$; thus any two lines $l_1$ and $l_2$ in $L_l$ are
$2\delta$-consecutive. Now, for an arbitrary dimension $i \in [\![1, n]\!]$, by applying Lemma 2.22 on the pair of lines $l_1, l_2 \in L_l$ such
that $(d^I_{l_1})_i = \min_{l' \in L_l} (d^I_{l'})_i$ and $(d^I_{l_2})_i = \max_{l' \in L_l} (d^I_{l'})_i$, one has that the difference of the $i$-th coordinates between any two
points in $D_l$ is upper bounded by $2\delta$. Since this is true for any $i$, item (2) is true. The same goes for birthpoints. □


Remark 3.10. These bounds are sharp in dimension $n \geq 3$:


1. (1) Let $\delta > 0$ and $I$ be the interval with support $\mathrm{supp}(I) = \{x \in \mathbb{R}^n : \langle x, \mathbf{1} \rangle \geq \delta\}$. Let $l$ be the diagonal line passing
through $0$. Then, one has


2 


ô ô 2. (n-2) 
H= (2), bise (Osgb sa Binge ~ 6,224, =5,— 


n 


ôl. 


In particular, one has $\|b^I_{l + \delta e_1} - b^I_{l + \delta e_2}\|_\infty = \delta$ using the lines $l + \delta e_1$ and $l + \delta e_2$, which both belong to $L_l$.


2. (2) Let $I$ be an interval whose support has a facet $F$ of codirection different than $n$, and let $l$ be a diagonal line such
that $\{b^I_{l'} : l' \in L_l\} \subset F$. Then the radius of the ball containing $B_l$ is exactly $2\delta$, as illustrated with the red and blue
facets in Figure 5.
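
The first sharpness example can be checked numerically. The sketch below is only a sanity check under the assumptions of item (1): the support is the half-space $\{x : \langle x, \mathbf{1} \rangle \geq \delta\}$, so that the birthpoint along the diagonal line through $p$ is $p + t\mathbf{1}$ with $t = (\delta - \sum_i p_i)/n$. The function names are hypothetical.

```python
def halfspace_birthpoint(p, delta):
    """Birthpoint along {p + t*(1,...,1)} for the support {x : <x, 1> >= delta}."""
    n = len(p)
    t = (delta - sum(p)) / n  # solves <p + t*1, 1> = delta
    return tuple(pi + t for pi in p)


def sup_dist(x, y):
    """Sup-norm distance between two points."""
    return max(abs(a - b) for a, b in zip(x, y))


delta = 0.5
b1 = halfspace_birthpoint((delta, 0.0, 0.0), delta)  # line l + delta*e_1
b2 = halfspace_birthpoint((0.0, delta, 0.0), delta)  # line l + delta*e_2
gap = sup_dist(b1, b2)  # distance between two birthpoints of the same surrounding set
```

In dimension $n = 3$, the two birthpoints are $\delta e_1$ and $\delta e_2$, so `gap` equals $\delta$, in agreement with the equality stated in item (1).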


4. Exact reconstruction 


In this section, we show that, under some assumptions on the family of lines that are used and on the underlying
multipersistence module $M$, our approximation $\tilde{M}$ computed by Algorithm 1 cannot be separated from $M$ by either the
interleaving or the bottleneck distance (see Proposition 4.4 and the associated corollary). Roughly speaking, when $M$ is interval
decomposable, we need assumptions that ensure that all the facets of the summands of $M$ can be identified with associated
labels by Algorithm 3. This means that the facets have to be large enough with respect to the spacing between the lines
in order to make sure that lines in surrounding sets can reach the same common facets. In addition to this, one also has
to ensure that taking the minimum (resp. maximum) of birthpoints (resp. deathpoints) in surrounding sets, as prescribed
by Algorithm 4, induces a corner that indeed belongs to the support of the multipersistence module. This means that
the support of the module cannot contain holes of small size, such that a line could go through the hole and avoid the
support, while all surrounding lines would intersect the support, which would lead to a fake corner.

We now characterize those interval modules that satisfy the aforementioned informal assumptions. Given a size
parameter $\delta > 0$, these interval modules form a subclass of the family of discretely presented interval modules, that we
call the $\delta$-discretely presented interval modules.


Definition 4.1 ($\delta$-discretely presented interval module). Let $K \subset \mathbb{R}^n$ be a compact rectangle of $\mathbb{R}^n$, and let $I$ be a
discretely presented interval module. Given $\delta > 0$, we say that $I$ is $\delta$-discretely presented in $K$ if:


1. (Large facets) for each point $x \in L[I]$ (resp. $U[I]$) there exists, for each facet $F$ containing $x$, an $(n-1)$-hypercube
$Q^F_x$ of side length $2\delta$ such that $x \in Q^F_x$ and $Q^F_x \subseteq F$;


2. (Large holes) if there exists a diagonal line $l$ such that $l \cap \mathrm{supp}(I) = \emptyset$, then there exists an $n$-hypercube $R$ of side
length $\delta$ containing $0$ such that for any line $l'$ in $l + R$, one has $l' \cap \mathrm{supp}(I) = \emptyset$;


3. (Locally small complexity) any $\infty$-ball of radius $\delta$, i.e., any set $B_\delta(x) := \{y \in \mathbb{R}^n : d_\infty(x, y) \leq \delta\}$ for some $x \in \mathbb{R}^n$,
intersects at most one facet in $L[I]$ (resp. $U[I]$) of any fixed codirection;


4. (Compact description) each facet of $I$ has a non-empty intersection with $K$.


Assumptions 1 and 2 correspond to the assumptions mentioned at the beginning of the section, while Assumptions 3 
and 4 ensure that surrounding sets of lines can detect at most one facet associated to a given codirection at a time, and 
that critical points of I are all included in a rectangle respectively. 


Remark 4.2. One might wonder whether Assumption 2 and Assumption 3 are redundant with Assumption 1. In other
words, one might wonder whether it is actually possible to define an interval module with large facets and small holes,
or with large facets that can share the same codirection and lie close to each other at the same time. Even though this
seems to be impossible when $n = 2$ (indicating that Assumption 2 and Assumption 3 might indeed be redundant with
Assumption 1), it can definitely happen in dimension $n \geq 3$, as Figure 8 shows.

Figure 8: Example of interval module in dimension $n = 3$ with large facets, small holes and some facets with the same
codirection close to each other. The support of the module can be constructed by taking the (closed) red and (open) green
L-shaped sets on (Left), and gluing them together as shown in (Middle). While arbitrarily large facets can be created using
this construction, the resulting interval always contains a small hole and large facets of same codirection that are close to
each other. Because of this, it is possible to find a (blue) diagonal line that goes through the support without intersecting
it, while lines in its surrounding set will detect some facets. (Right) View of the interval from the top showing the hole
and the spatially close facets (shown in bold font). This is an example where Assumptions 1 and 4 of Definition 4.1 are
satisfied, while Assumptions 2 and 3 are not.


The main advantage of $\delta$-grids and $\delta$-discretely presented modules is that they ensure that Algorithm 3 can identify
every single facet with a corresponding label.


Lemma 4.3. Let $\delta > 0$ and $K$ be a compact rectangle of $\mathbb{R}^n$. Let $I$ be a $\delta$-discretely presented interval module in $K$, and let $L$
be a $\delta$-grid of $K^{2\delta}$. Then, there is a bijection between the facets of $I$ and the labels identified by Algorithm 3.


Proof. We first prove the result for birthpoints and facets of $L[I]$.

Let $F$ be a facet of $L[I]$. Let $l_F \in L$ be a diagonal line intersecting $F$, and $b_F \in \mathbb{R}^n$ be the associated birthpoint. By
Definition 4.1, item (1), there exists an $(n-1)$-hypercube $Q^F_{b_F} \subseteq F$ of side length $2\delta$ such that $b_F \in Q^F_{b_F}$. This ensures that
for any dimension $i$ that is not in the codirection, i.e., $i \in [\![1, n]\!] \setminus \mathrm{codir}(F)$, one has either $b_F + \delta e_i \in Q^F_{b_F}$ or $b_F - \delta e_i \in Q^F_{b_F}$.
Since $L$ is a $\delta$-grid of $K^{2\delta}$, and since $Q^F_{b_F}$ is an $(n-1)$-hypercube, there exists a line $l_0 \in L$ such that $l_F$ belongs to the
surrounding set $L_{l_0}$ and such that the birthpoints corresponding to the lines in $L_{l_0}$ are all in $Q^F_{b_F}$. This means that $\mathrm{codir}(F)$
is detected as a label of $b_F$ by Algorithm 3.

Conversely, assume there exists a line $l_0 \in L$ such that all birthpoints associated to the lines in the surrounding set
$L_{l_0}$ share a coordinate along dimension $i \in [\![1, n]\!]$, so that $i$ is a label detected by Algorithm 3. Then, the set of birthpoints
$B_{L_{l_0}}$ has a minimal element, and thus its convex hull $\mathrm{conv}(B_{L_{l_0}})$ is in $L[I]$. Since $\mathrm{conv}(B_{L_{l_0}})$ is an $(n-1)$-hypercube of
codirection $i$, it must be associated to a facet of $L[I]$ of codirection $i$ as well.

The proof extends straightforwardly to deathpoints. □


Now that we have proved that all facets can be detected with $\delta$-grids and $\delta$-discretely presented modules, we can
state our first main result, which claims that it is possible to exactly recover the underlying module under the same
assumptions.


Proposition 4.4 (Exact recovery). Let $\delta > 0$ and $K = R_{\alpha,\beta}$ be a compact rectangle of $\mathbb{R}^n$, where $\alpha < \beta$. Let $I$ be a $\delta$-discretely
presented interval module in $K$, and let $L$ be a $\delta$-grid of $K^{2\delta}$. Let $C_B(I)$ and $C_D(I)$ be the $L$-birth and death corners of $I$
computed by Algorithm 4, and let $\tilde{I} = \mathrm{Ind}\left(\bigcup_{c \in C_B(I)} \bigcup_{c' \in C_D(I)} R_{c,c'}\right)$ be the approximation computed by Algorithm 2. Then,
one has:

$d_I(I, \tilde{I}) = d_B(I, \tilde{I}) = 0$. (5)


Proof. As interval modules are characterized by their support, it is enough to show that $\mathrm{supp}(\tilde{I}) = \mathrm{supp}(I)$. In the
following, we thus assume that $\mathrm{supp}(I)$ is closed in $\mathbb{R}^n$.


We first show the inclusion $\mathrm{supp}(\tilde{I}) \subseteq \mathrm{supp}(I)$. More specifically, we have to prove that the (finite, pseudo and
infinite) $L$-corners computed by Algorithm 4 all belong to $\mathrm{supp}(I)$. A key argument that we will use several times comes
from the following lemma, which allows for a local control of the boundary of $\mathrm{supp}(I)$ using the hyperplanes associated
to specific $L$-corners.


Lemma 4.5. Let $b$ be a birthpoint (resp. deathpoint) of $I$ in $K^\delta$, and $l_0 \in L$ be the line such that $b \in \mathrm{conv}(L_{l_0})$ (this line
exists since $L$ fills $K^{2\delta}$). Then, one has the following:


1. for any facet $F$ of $L[I]$ (resp. $U[I]$) containing $b$, there exists a line $l_F \in L_{l_0}$ such that $b^I_{l_F} \in F$ (resp. $d^I_{l_F} \in F$).

2. for any dimension $i$, there exists at most one facet of codirection $i$ intersecting the set of birthpoints (resp. deathpoints)
$\{b^I_l : l \in L_{l_0}\}$ (resp. $\{d^I_l : l \in L_{l_0}\}$).


3. let $b'$ (resp. $d'$) be the pseudo or finite $L$-corner generated by $L_{l_0}$. Then, one has:


$\mathrm{conv}(L_{l_0}) \cap L[I] \cap K^\delta \subseteq \bigcup_{i \in \mathrm{codir}(b')} \{x \in \mathbb{R}^n : x_i = b'_i\}$

(resp. $\mathrm{conv}(L_{l_0}) \cap U[I] \cap K^\delta \subseteq \bigcup_{i \in \mathrm{codir}(d')} \{x \in \mathbb{R}^n : x_i = d'_i\}$).


Proof. We only show the result for birthpoints since the arguments for deathpoints are the same. Let $b \in L[I]$ be a
birthpoint in $K^\delta$.


Proof of (1). Let $F$ be a facet containing $b$. According to Definition 4.1, item (1), there exists an $(n-1)$-hypercube $Q^F_b$
of side length $2\delta$ such that $Q^F_b \subseteq F$ and $b \in Q^F_b$. Since $L$ is a grid, there exists a line $l \in L$ with $d_\infty(b, l) \leq \delta$ intersecting
$Q^F_b$. Now, since $b \in \mathrm{conv}(L_{l_0})$, one has $d_\infty(l \cap H_F, l_0 \cap H_F) \leq \delta$, where $H_F$ is the hyperplane containing $F$; thus, $l \in L_{l_0}$
(the argument is the same as in the proof of Proposition 3.9, first paragraph).


Proof of (2). By Proposition 3.9, item (2), the birthpoints associated to lines of $L_{l_0}$ are all contained in a ball of radius
$\delta$. Thus, the uniqueness of the facets with given codirection comes straightforwardly from Definition 4.1, item (3).


Proof of (3). Note that the birthpoint $b$ is obviously included in the facets of $L[I]$ that contain it, which is a subset of
the facets associated to the birthpoints of the lines in $L_{l_0}$. Now, as Lemma 4.3 ensures that the birthpoints associated to
lines in $L_{l_0}$ are correctly labelled, the pseudo or finite $L$-corner generated by $L_{l_0}$ must be on the intersection of the facets
containing $b$. This ensures that

$b \in \bigcup_{i \in \mathrm{codir}(b')} \{x \in \mathbb{R}^n : x_i = b'_i\}$.


Since these arguments do not depend on $b \in \mathrm{conv}(L_{l_0})$, the result follows. □


Now that we have Lemma 4.5, we can prove that finite, pseudo and infinite $L$-corners belong to $\mathrm{supp}(I)$. We will
prove the results for birth corners, but the arguments for death corners are exactly the same.


Finite and pseudo corners. Let $b$ be a finite or pseudo $L$-birth corner, associated to a set of consecutive lines $L_{l_0}$
for some line $l_0 \in L$. By assumption, each birthpoint $b^I_l$, for $l \in L_{l_0}$, is nontrivial; and thus any birthpoint in $\mathrm{conv}(L_{l_0})$ is
nontrivial as well, using Definition 4.1, item (2). Let $l \in \mathrm{conv}(L_{l_0})$ be the diagonal line passing through $b$.

Using Lemma 4.5, one has:


$b^I_l \in \left(\mathrm{conv}(L_{l_0}) \cap L[I] \cap K^\delta\right) \cap l \subseteq \left(\bigcup_{i \in \mathrm{codir}(b)} \{x \in \mathbb{R}^n : x_i = b_i\}\right) \cap l = \{b\}$.


Thus $b = b^I_l$ and $b \in \mathrm{supp}(I)$.


Infinite corners. Let $b$ be an infinite $L$-birth corner, and let $b'$ be the corresponding minimal pseudo $L$-birth corner,
associated to a set of consecutive lines $L_{l_0}$ for some line $l_0 \in L$. We will show that, if $j$ is a free coordinate of $b'$, i.e., if
$j \in \mathrm{dir}(b')$, then $b'_j \leq \alpha_j$ (recall that $K$ is the rectangle $R_{\alpha,\beta}$). The reason we want to prove such inequalities is that they
directly lead to the result. Indeed, if $b'_j \leq \alpha_j$ for any $j \in \mathrm{dir}(b')$, then $b' - t \sum_{j \in \mathrm{dir}(b')} e_j$ belongs to $L[I]$ for any $t \geq 0$,
since otherwise the line $\{b' - t \sum_{j \in \mathrm{dir}(b')} e_j : t \geq 0\}$ would have to intersect a facet $F \subseteq L[I]$ of codirection $j$ for some
$j \in \mathrm{dir}(b')$, which would not intersect $K$, contradicting Definition 4.1, item (4).

Let $j \in \mathrm{dir}(b')$ be a free coordinate. By contradiction, assume that $b'_j > \alpha_j$, and let $b''$ denote the pseudo $L$-corner
generated by $L_{l_0 - \delta e_j}$. In particular, this means that, for any $l \in L_{l_0}$, $l - \delta e_j \in L$ and $L_{l_0 - \delta e_j} \subseteq L$ since $L$ fills $K^{2\delta}$. Now, if
for every line $l \in L_{l_0}$ such that $l = l_0 + v$ with $v_j = 0$, one has that $b^I_l$ and $b^I_{l - \delta e_j}$ are on the same facets, then one has
$b^I_{l - \delta e_j} = b^I_l - \delta e_j$, and the pseudo corner $b''$ is equal to $b' - \delta e_j$ by construction, as per Algorithm 4. Moreover, one has
$b'' = b' - \delta e_j \leq b'$, contradicting the fact that $b'$ is minimal. Hence, there is at least one line $l \in L_{l_0}$, $l = l_0 + v$ with
$v_j = 0$, such that $b^I_l$ and $b^I_{l - \delta e_j}$ are not on the same facets; in other words, there exists a facet $F_j$ of $L[I]$ of codirection
$j$ that intersects the (half-open) segment $[b^I_l - \delta e_j, b^I_l)$. In order to locate that facet more precisely, we will prove the
following lemma:


Lemma 4.6. For any $i \in [\![1,n]\!]$ and $s,t \in \mathbb{R}$ such that $s < t$, one has $(b^{l - te_i})_i < (b^{l - se_i})_i$.

Proof. Without loss of generality, assume $s = 0$. Since $b^l - te_i \in l - te_i$, it follows that $b^l - te_i$ and $b^{l - te_i}$ are comparable. Moreover, one must have $b^l - te_i \leq b^{l - te_i}$; otherwise one would have $b^l > b^l - te_i > b^{l - te_i}$, contradicting Remark. If the points are equal, i.e., $b^l - te_i = b^{l - te_i}$, then one has $(b^l)_i > (b^{l - te_i})_i$. Otherwise, if $b^l - te_i < b^{l - te_i}$, then

$$\forall k \neq i, \quad (b^{l - te_i})_k > (b^l)_k.$$

Moreover, since $b^l$ and $b^{l - te_i}$ cannot be comparable as per Remark, one must have $(b^{l - te_i})_i < (b^l)_i$.


Let $H_j = \{x \in \mathbb{R}^n : x_j = c_j\}$ be the hyperplane associated to $F_j$. Then, by Lemma 4.6, one has

$$(b^{l - \delta e_j})_j \leq c_j < (b^l)_j.$$

Since the lines $l$ and $l - \delta e_j$ both belong to the surrounding set $L_{l_0 - \delta e_j}$, it follows from Lemmas 4.3 and 4.5, item (3), that $\mathrm{codir}(b^j) \supseteq \mathrm{codir}(b') \cup \{j\}$. Moreover, since the facets of $L[I]$ associated to $\mathrm{codir}(b^j)$ are unique in a $\delta$-ball around $b^j$, as per Definition 4.1, item (3), they all have a unique associated value $c_i$ (corresponding to their associated hyperplanes).

Finally, we will show that $b^j < b'$. Let $i \in [\![1,n]\!]$ be an arbitrary dimension.

• If $i \in \mathrm{codir}(b')$, then $b^j_i = b'_i$.
• If $i \in \mathrm{codir}(b^j) \setminus \mathrm{codir}(b')$, then $b^j_i \in \{c_i, \min_{l \in L_{l_0 - \delta e_j}} (b^l)_i\} \leq \min_{l \in L_{l_0}} (b^l)_i = b'_i$, with a strict inequality for $i = j$.
• If $i \in \mathrm{dir}(b^j) \subseteq \mathrm{dir}(b')$, then $b^j_i = \min_{l \in L_{l_0 - \delta e_j}} (b^l)_i \leq \min_{l \in L_{l_0}} (b^l)_i = b'_i$.

Hence, one always has $b^j_i \leq b'_i$, and thus $b^j < b'$, which contradicts the fact that $b'$ is minimal. Thus, one must have $b'_j < \alpha_j$.


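We now show that $\mathrm{supp}(I) \subseteq \mathrm{supp}(\tilde{I})$. Let $x \in \mathrm{supp}(I)$. We will show that there exists an $L$-birth corner $c$ such that $c \leq x$. Let $\mathcal{H}$ be the family of hyperplanes associated to the facets of $L[I]$. The corner $c$ will be defined as the limit of a sequence of points $\{x^{(k)}\}_{k}$ in $\mathbb{R}^n$, defined by induction with:

1. $x^{(1)} = \inf \big(\{x - t \cdot \mathbf{1} : t \geq 0\} \cap \mathrm{supp}(I)\big)$. Then, one has the two following possibilities:

• either $x^{(1)} = -\infty$, and we let $c := x^{(1)}$.

• or there exists a maximal subset of hyperplanes $\mathcal{H}^1 \subseteq \mathcal{H}$, $\mathcal{H}^1 \neq \emptyset$, such that $x^{(1)} \in \bigcap_{H \in \mathcal{H}^1} H =: H_1$. Let $J^1 \subseteq [\![1,n]\!]$ be the set of free coordinates in $H_1$, i.e., those dimensions such that $j \in J^1 \iff x^{(1)} - e_j \in H_1$.

2. $x^{(2)} = \inf \big(\{x^{(1)} - t \cdot \sum_{j \in J^1} e_j : t \geq 0\} \cap \mathrm{supp}(I)\big)$. Then, one has the two following possibilities:

• either $x^{(2)}$ is at infinity in $H_1$, i.e., $x^{(2)}_j = -\infty$ if $j \in J^1$ and $x^{(2)}_j = x^{(1)}_j$ otherwise, and we let $c := x^{(2)}$.

• or there exists a maximal subset of hyperplanes $\mathcal{H}^2 \supsetneq \mathcal{H}^1$ such that $x^{(2)} \in \bigcap_{H \in \mathcal{H}^2} H =: H_2$. Let $J^2 \subseteq [\![1,n]\!]$ be the set of free coordinates in $H_2$, i.e., those dimensions such that $j \in J^2 \iff x^{(2)} - e_j \in H_2$.

3. For $k \geq 2$, $x^{(k+1)} = \inf \big(\{x^{(k)} - t \cdot \sum_{j \in J^k} e_j : t \geq 0\} \cap \mathrm{supp}(I)\big)$. Then, one has the two following possibilities:

• either $x^{(k+1)}$ is at infinity in $H_k$, i.e., $x^{(k+1)}_j = -\infty$ if $j \in J^k$ and $x^{(k+1)}_j = x^{(k)}_j$ otherwise, and we let $c := x^{(k+1)}$.

• or there exists a maximal subset of hyperplanes $\mathcal{H}^{k+1} \supsetneq \mathcal{H}^k$ such that $x^{(k+1)} \in \bigcap_{H \in \mathcal{H}^{k+1}} H =: H_{k+1}$. Let $J^{k+1} \subseteq [\![1,n]\!]$ be the set of free coordinates in $H_{k+1}$, i.e., those dimensions such that $j \in J^{k+1} \iff x^{(k+1)} - e_j \in H_{k+1}$.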


If this sequence stops at step one, i.e., $c = x^{(1)} = -\infty$, then every birthpoint of $I$ is at $-\infty$, the only birth corner is $c = -\infty$, and one trivially has $c \leq x$. Hence, we assume in the following that $c$ is obtained after at least one iteration of the sequence. Note that this sequence of points has length at most $n$. Let $c^-$ and $c$ be the penultimate and last elements of the sequence respectively, and let $J^-$ be the set of free coordinates associated to $c^-$. By construction, one has:

$$c \leq c^- \leq \dots \leq x^{(1)} \leq x.$$


We now show that $c$ is indeed a birth corner. If $c$ is finite, then it must belong to the intersection of $n$ hyperplanes, and it is thus a finite birth corner. Hence, we assume now that $c$ is not finite. We will construct a minimal pseudo birth corner from $c^-$, and show that $c$ is its associated infinite birth corner. We will consider two different cases, depending on whether $c^-$ is close to $K = R_{\alpha,\beta}$ or not. If $c^- \in K^{\delta}$, the filling property of $L$ and the size of the facets of $L[I]$ ensure that $c^-$ is itself a minimal pseudo birth corner, associated to $c$, which is thus an infinite birth corner. If $c^- \notin K^{\delta}$, then let $v \in \mathbb{R}^n$ be a vector that pushes $c^-$ back into $K^{\delta}$, i.e., such that, for any dimension $i \in J^-$, one has

$$\alpha_i - \delta < (c^- + v)_i < \alpha_i,$$

and $v_i = 0$ if $i \notin J^-$. Let $S$ be the segment $[c^-, c^- + v]$. We have the two following cases:

1. Assume $S \subseteq L[I]$. Then $c^- + v \in \mathrm{supp}(I) \cap K^{\delta}$, and there exists a line $l \in L$ such that $c^- + v \in \mathrm{conv}(L_l)$. Let $c'$ be the pseudo birth corner associated to $L_l$. Since one has $c'_j < \alpha_j$ for any dimension $j \in J^-$, it follows that $J^- \subseteq \mathrm{dir}(c')$. Furthermore, since $c^- + v$ belongs to the same facets as $c$ and $c^-$, and since $c^- + v \in \mathrm{conv}(L_l)$, one has $\mathrm{codir}(c') \supseteq \mathrm{codir}(c)$ and $\mathrm{dir}(c) = J^-$. Thus, $c$ is an infinite birth corner associated to the minimal pseudo birth corner $c'$.

2. Assume $S \not\subseteq L[I]$. In that case, there must be a facet of codirection $j$, for some $j \in J^-$, that intersects $S$. Since one has $c^-_j \leq (c^- + v)_j < \alpha_j$ for any $j \in J^-$, this means that the facet would not intersect $K$, which yields a contradiction as per Definition 4.1, item (4).


This concludes that $\mathrm{supp}(I) \subseteq \mathrm{supp}(\tilde{I})$, and the equality between these supports holds. □

Proposition 4.4 extends to the following corollary, whose proof is immediate from the definition of exact matchings (see Definition 3.5 above).

Corollary 4.6.1. Let $M$ be an interval decomposable multipersistence module, whose interval summands all satisfy the assumptions of Proposition 4.4. Let $\tilde{M}$ be the multipersistence module computed by Algorithm 1. Then, one has

$$d_I(M, \tilde{M}) = d_b(M, \tilde{M}) = 0.$$


5. Multipersistence module approximation 


In this section, we propose an approximation result, which states that the bottleneck and interleaving distances between an interval decomposable multipersistence module $M$ and its approximation $\tilde{M}$ computed with Algorithm 1 can be upper bounded under weaker assumptions than the ones in Proposition 4.4. In order to do this, we first characterize, in Section 5.1, a family of approximation modules that we call candidates, and whose distance to a target module can be controlled. Then, we show in Section 5.2 that the module approximation computed by Algorithm 1 indeed belongs to this family.


5.1. Candidates and approximation error 


In this section, we define a family of "good" candidate multipersistence modules (see Definition 5.1) for approximating an interval decomposable multipersistence module $M$, in the sense that $d_I(M, \tilde{M})$ and $d_b(M, \tilde{M})$ are upper bounded for any module $\tilde{M}$ in this family.

Support assumption. In order to simplify proofs, we assume in this section that $\mathrm{supp}(M) \subseteq K$, where $K$ is a compact set in $\mathbb{R}^n$. This assumption is used in practice, for instance in [CB20, CFK+19, Vip20b], where multipersistence modules are either finite or intersected with a compact set in order to generate descriptors.

Candidates. We first define candidate modules, which are, roughly speaking, modules with the same fibered barcodes as $M$ on a regular set of lines, paired with a candidate pairing that commutes with the exact matching induced by $M$.


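Definition 5.1 (Candidate). Let $K$ be a compact set of $\mathbb{R}^n$, $\delta > 0$ and $L$ be a $\delta$-grid of $K^{2\delta}$. Let $M = \bigoplus_{i \in \mathcal{I}} I_i$ be an interval decomposable multipersistence module, with $L$-fibered barcode $FB(M)_L = \{B(M|_l) : l \in L\}$. Let $\sigma_M$ be its associated exact matching. An interval decomposable multipersistence module $\tilde{M} = \bigoplus_{i \in \tilde{\mathcal{I}}} \tilde{I}_i$ is called an $L$-candidate of $M$ if:

1. $B(\tilde{M}|_l) = B(M|_l)$ for any $l \in L$, i.e., their $L$-fibered barcodes are the same, and

2. there exists a surjection $\nu : \mathcal{I} \to \tilde{\mathcal{I}}$ such that $i \notin \mathrm{coim}(\nu) \implies d_I(I_i, 0) \leq \delta$, and such that, for any two consecutive lines $l, l' \in L$, the following diagram commutes:

$$\begin{array}{ccc} B(M|_l) & \xrightarrow{\ \sigma_M\ } & B(M|_{l'}) \\ \downarrow{\scriptstyle \nu} & & \downarrow{\scriptstyle \nu} \\ B(\tilde{M}|_l) & \xrightarrow{\ \sigma_{\tilde{M}}\ } & B(\tilde{M}|_{l'}) \end{array}$$

where $\nu : I_i|_l \in B(M|_l) \mapsto \tilde{I}_{\nu(i)}|_l \in B(\tilde{M}|_l)$. In other words, $M$ and $\tilde{M}$ have the same matched barcodes along $L$, up to interval reordering. We call $\nu$ the candidate interval pairing.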


We now claim that multipersistence modules that are $L$-candidates of a given interval decomposable multipersistence module $M$ are $\delta$-close to $M$, as stated in the following approximation result.

Proposition 5.2 (Approximation result). Let $K$ be a compact set of $\mathbb{R}^n$, $\delta > 0$ and $L$ be a $\delta$-grid of $K$. Let $M$ be an interval decomposable multipersistence module. Then, any $L$-candidate $\tilde{M}$ of $M|_K$ satisfies $d_I(\tilde{M}, M|_K) \leq d_b(\tilde{M}, M|_K) \leq \delta$.
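Proof. Let $M = \bigoplus_{i \in \mathcal{I}} I_i$ and $\tilde{M} = \bigoplus_{i \in \tilde{\mathcal{I}}} \tilde{I}_i$ be the interval decompositions of $M$ and $\tilde{M}$, and $\nu$ be the associated candidate interval pairing. Without loss of generality, assume that the support of $M$ is included in $K$, i.e., $M = M|_K$. In order to upper bound the bottleneck distance $d_b(M, \tilde{M})$, one can upper bound the interleaving distance $d_I(I_i, \tilde{I}_{\nu(i)})$ for any index $i \in \mathcal{I}$. Let $I$ and $\tilde{I}$ be two such intervals (we drop the index $i$ to lighten notation). Since $I$ and $\tilde{I}$ are interval modules, and thus indicator modules, the morphisms $\varphi^{(\delta)}_{I \to \tilde{I}} : I \to \tilde{I}[\delta]$ and $\varphi^{(\delta)}_{\tilde{I} \to I} : \tilde{I} \to I[\delta]$ are well-defined, as per Definition 2.15. We thus need to show that these morphisms commute, i.e., that they induce a $\delta$-interleaving. Hence, we first show that

$$\big(\varphi^{(\delta)}_{\tilde{I} \to I}\big)_{x+\delta} \circ \big(\varphi^{(\delta)}_{I \to \tilde{I}}\big)_{x} = \big(\varphi^{(2\delta)}_{I}\big)_{x} \qquad (6)$$

for any $x \in \mathbb{R}^n$.

Let $x \in K$. If $x \in l$ for some line $l \in L$, Equation (6) is satisfied from $\mathrm{supp}(I) \cap l = \mathrm{supp}(\tilde{I}) \cap l$, which itself comes from the fact that $\tilde{M}$ is an $L$-candidate of $M$. Hence, we assume in the following that $x \notin \bigcup_{l \in L} l$. Furthermore, if $x \notin \mathrm{supp}(I)$ or $x + 2\delta \notin \mathrm{supp}(I)$, then Equation (6) is trivially satisfied. Hence, we also assume $x, x + 2\delta \in \mathrm{supp}(I) \subseteq K$. This means that $b^{l_x}_I$ and $d^{l_x}_I$ are well-defined (where $l_x$ is the diagonal line passing through $x$), and that $\big(\varphi^{(2\delta)}_{I}\big)_x = \mathrm{id}_k$. Thus we only have to show that $\tilde{I}_{x+\delta} = k$, i.e., $x + \delta \in \mathrm{supp}(\tilde{I})$.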


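As $L$ is a $\delta$-grid, let $l \in L$ be a line such that $x \in \mathrm{conv}(L_l)$ and let $l_x \subseteq \mathrm{conv}(L_l)$ be the diagonal line passing through $x$. Now, as $R_{x,x+\delta} \subseteq \mathrm{supp}(I)$, Lemma 2.22 ensures that $B(I|_{l'}) \neq \emptyset$ for any line $l' \in L$ that is $\delta$-comparable to $l_x$; and the same holds for $\tilde{I}$ since $FB(I)_L = FB(\tilde{I})_L$. Using Proposition 3.9 on both $I$ and $\tilde{I}$, there exist two sets $B_l$ and $D_l$ such that $d^{l_x}_I, d^{l_x}_{\tilde{I}} \in D_l$ and $b^{l_x}_I, b^{l_x}_{\tilde{I}} \in B_l$, with the segments $l_x \cap B_l$ and $l_x \cap D_l$ having length at most $\delta$. Since one also has

$$b^{l_x}_I \leq x < x + 2\delta \leq d^{l_x}_I,$$

and $\|b^{l_x}_I - b^{l_x}_{\tilde{I}}\|_\infty \leq \delta$, $\|d^{l_x}_I - d^{l_x}_{\tilde{I}}\|_\infty \leq \delta$, one finally has $b^{l_x}_{\tilde{I}} \leq x + \delta \leq d^{l_x}_{\tilde{I}}$, which concludes that $x + \delta \in \mathrm{supp}(\tilde{I})$.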


Remark 5.3. This bound is sharp up to a $\frac{1}{2}$ factor, as illustrated by Figure 9.

Figure 9: Two interval modules, one with support colored in red (Left) and the other in blue (Right). These modules have the same barcodes (green bars) along two $\delta$-consecutive lines; and $d_b(I_1, I_2) = d_I(I_1, I_2) = \delta/2$. This construction can easily be generalized in $\mathbb{R}^n$ with $n \geq 2$ by setting $I_1$ as the union of two hypercubes of side length $\delta/2$ located on the anti-diagonal, and $I_2$ as the standard hypercube with side length $\delta$.


5.2. Algorithm 1 provides a candidate


In this section, we first show, given an interval module $I$ (with support included in a compact set $K$), that the approximation $\tilde{I}$ computed by Algorithm 2 is an $L$-candidate of $I$ (see Proposition 5.4). This will in turn allow us to state our final approximation bound for Algorithm 1, which is valid for any interval decomposable multipersistence module $M$ (see Proposition 5.5).


Proposition 5.4. Let $I$ be an interval module with support in a compact set $K \subseteq \mathbb{R}^n$, $\delta > 0$, and $L$ be a $\delta$-grid of $K$. Let $\tilde{I}$ be the interval module computed with Algorithm 2. Then, $\tilde{I}$ is an $L$-candidate of $I$.


Proof. Let $C_b(\tilde{I})$ and $C_d(\tilde{I})$ be the birth and death corners computed by Algorithm 4, i.e., one has

$$\tilde{I} = \mathrm{Ind}\Big(\bigcup_{c \in C_b(\tilde{I})} \ \bigcup_{d \in C_d(\tilde{I})} R_{c,d}\Big). \qquad (7)$$

In order to show that $\tilde{I}$ is an $L$-candidate of $I$, we need to show that the $L$-fibered barcodes of $I$ and $\tilde{I}$ are the same, i.e., $FB(I)_L = FB(\tilde{I})_L$. Equivalently, we need to show that $\mathrm{supp}(I) \cap l = \mathrm{supp}(\tilde{I}) \cap l$ for any line $l \in L$. We first show that they share the same birthpoints, i.e., that $L[I] \cap L = L[\tilde{I}] \cap L$. Let $l \in L$. Note that $b^l_I$ and $b^l_{\tilde{I}}$ are comparable since they belong to the same diagonal line $l$.

Strategy. In order to show $b^l_I = b^l_{\tilde{I}}$, we are going to show that 1. $b^l_I \leq b^l_{\tilde{I}}$ and 2. $b^l_{\tilde{I}} \leq b^l_I$.

1. In order to show $b^l_I \leq b^l_{\tilde{I}}$, we are going to show that $c \not< b^l_I$ for any corner $c \in C_b(\tilde{I})$. Indeed, if one assumes $b^l_I > b^l_{\tilde{I}}$, and since there always exists a birth corner $c \in C_b(\tilde{I})$ such that $c \leq b^l_{\tilde{I}}$ by construction of $\tilde{I}$, one has $c \leq b^l_{\tilde{I}} < b^l_I$.

2. In order to show $b^l_{\tilde{I}} \leq b^l_I$, we are going to show that there exists a corner $c \in C_b(\tilde{I})$ such that $c \leq b^l_I$. Indeed, if there is such a birth corner, and if $b^l_{\tilde{I}} > b^l_I$ by contradiction, then $c \leq b^l_I < b^l_{\tilde{I}}$, and $R_{c, b^l_{\tilde{I}}}$ is not flat, contradicting Remark 2.20.


Proof of (2). By construction of $\tilde{I}$ with Algorithm 4, if $b^l_I$ is labelled, then there exists a line $l'$ and a corner $c'' \in C_b(\tilde{I})$ that is smaller than $b^l_I$, so we can take $c := c''$. If $b^l_I$ is not labelled, it belongs itself to $C_b(\tilde{I})$, and we can take $c := b^l_I$.


Proof of (1). Let $c \in C_b(\tilde{I})$ be a birth corner, and let $L_{l_0}$ be the associated surrounding set of lines for some $l_0 \in L$. Let $[c]_l := \min\,[(c + (\mathbb{R}_+)^n) \cap l]$ be the smallest element in the intersection between the positive cone on $c$ and $l$. Assume $[c]_l \geq b^l_I$ and $c < b^l_I$. Then $R_{c,[c]_l}$ is not flat, contradicting the fact that $[c]_l$ is the smallest element. Thus, we only have to show $[c]_l \geq b^l_I$. There are two cases.

• Either some birthpoints of $L_{l_0}$ are not labelled by Algorithm 3, and $c$ is equal to $b^{l'}_I$ for some $l' \in L_{l_0}$. Now, assume $[c]_l < b^l_I$ by contradiction. Then $b^{l'}_I = c \leq [c]_l < b^l_I$. Thus $b^{l'}_I < b^l_I$ and $R_{b^{l'}_I, b^l_I}$ is not flat, contradicting Remark 2.20. Hence $[c]_l \geq b^l_I$.

• Or all the birthpoints of $L_{l_0}$ are labelled by Algorithm 3. Again, we study two separate cases. See Figure 10 for an illustration.

- Either $l \in L_{l_0}$. Then, there exists $i \in [\![1,n]\!]$ such that $(b^l_I)_i = c_i$. This yields $(b^l_I)_i = c_i \leq ([c]_l)_i$, and thus $[c]_l \geq b^l_I$ since they both belong to the same diagonal line $l$.


- Or the line $l$ does not belong to $L_{l_0}$. Since $[c]_l$ is on the boundary of the positive cone based on $c$, there exists $i \in [\![1,n]\!]$ such that $([c]_l)_i = c_i$. Assume again by contradiction that $b^l_I > [c]_l$, and write

$$[c]_l = c + \sum_{j \neq i} \alpha_j e_j =: c + v < b^l_I$$

with $\alpha_j \geq 0$ for $j \in [\![1,n]\!] \setminus \{i\}$. Since $l \notin L_{l_0}$, there exists some $j_0$ such that $\alpha_{j_0} \geq \delta$. Let $u := (\alpha_j \bmod \delta)_{j \in [\![1,n]\!]} = (([c]_l - c)_j \bmod \delta)_{j \in [\![1,n]\!]} \in [0, \delta)^n$, so that $u \leq v$. Let $l' := l_{c+u}$ be the diagonal line passing through $c + u$. Now, recall that the lines of $L$ are drawn on a grid (see Remark 3.3), so $l' \in L$ since $l' = l + u - v$. Moreover, one has by definition $c \in \mathrm{conv}(L_{l_0})$. Since the lines of $L$ are on a grid, one has

$$\forall l_1, l_2 \in L, \quad d_\infty\big(l_1 \cap H_n, \mathrm{conv}(L_{l_2}) \cap H_n\big) < \delta \implies l_1 \in L_{l_2}, \quad \text{where } H_n = \{x \in \mathbb{R}^n : x_n = c_n\}.$$

Now, note that $c + u$ and $c + u - u_n \cdot \mathbf{1}$ both belong to $l'$, and that $c + u - u_n \cdot \mathbf{1} \in H_n$. Moreover, since

$$\|(c + u - u_n \cdot \mathbf{1}) - c\|_\infty = \|u - u_n \cdot \mathbf{1}\|_\infty < \delta,$$

one has $l' \in L_{l_0}$. Thus, there exists $i' \in [\![1,n]\!]$ such that $(b^{l'}_I)_{i'} = c_{i'} \leq (c + u)_{i'}$, and thus $b^{l'}_I \leq c + u$ since $b^{l'}_I$ and $c + u$ are comparable on the diagonal line $l'$. Finally, $b^{l'}_I \leq c + u \leq c + v < b^l_I$, and $R_{b^{l'}_I, b^l_I}$ is not flat, contradicting Remark 2.20. Hence, $b^l_I \leq [c]_l$.
Figure 10: Illustration of $l$, $l'$, $c$, $[c]_l$, $[c]_{l'}$, $u$, $v$, $b^l_I$, $b^{l'}_I$, when one assumes that $[c]_l < b^l_I$.


The proof applies straightforwardly to deathpoints by symmetry. □
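The point $[c]_l$ used in the proof above has a simple closed form when the diagonal line is written as $l = \{p + t \cdot \mathbf{1} : t \in \mathbb{R}\}$: a point $p + t \cdot \mathbf{1}$ lies in the positive cone on $c$ exactly when $t \geq \max_i (c_i - p_i)$. The following minimal sketch illustrates this; the function name and the parametrization of $l$ by a base point `p` are illustrative assumptions, not notation from the algorithms of this paper.

```python
def smallest_point_on_line_above(c, p):
    """Return [c]_l, the smallest point of the diagonal line
    l = {p + t * (1, ..., 1)} lying in the positive cone c + (R_+)^n.

    Assumption: l is described by an arbitrary base point p on it.
    """
    # p + t*1 >= c coordinate-wise  <=>  t >= max_i (c_i - p_i)
    t = max(ci - pi for ci, pi in zip(c, p))
    return tuple(pi + t for pi in p)


corner = (1.0, 0.0)
base = (0.0, 0.5)
point = smallest_point_on_line_above(corner, base)
# At least one coordinate of [c]_l coincides with c, so R_{c,[c]_l} is flat.
on_boundary = any(x == ci for x, ci in zip(point, corner))
```

The last line checks the property used in the proof: $[c]_l$ always sits on the boundary of the positive cone on $c$.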


Proposition 5.5 (Conclusion). Let $M = \bigoplus_{i \in \mathcal{I}} I_i$ be an interval decomposable multipersistence module. Let $K$ be a compact rectangle in $\mathbb{R}^n$, $\delta > 0$, and let $L$ be a $\delta$-grid of $K^{2\delta}$. Let $\tilde{M} = \bigoplus_{i \in \tilde{\mathcal{I}}} \tilde{I}_i$ be the multipersistence module computed with Algorithm 1. Note that $\tilde{\mathcal{I}} \subseteq \mathcal{I}$ by construction. Then,

1. if a summand $I_i$ of $M$ is $2\delta$-discretely presented, then $d_I(I_i, \tilde{I}_i) = 0$.
2. if a summand $I_j$ of $M$ has a support included in $K$, then $d_I(I_j, \tilde{I}_j) \leq \delta$.
3. if a summand $I_i$ of $M$ is $\delta$-trivial (i.e., $d_I(I_i, 0) \leq \delta$), then either $\mathrm{supp}(I_i) \cap L \neq \emptyset$, and thus $i \in \tilde{\mathcal{I}}$ and $\tilde{I}_i$ is also $\delta$-trivial, or $\mathrm{supp}(I_i) \cap L = \emptyset$ and $\tilde{M}$ has no summand matched to $I_i$.

In particular, if $K \subseteq \mathbb{R}^n$ contains the supports of those summands of $M$ whose support is precompact, and if the remaining summands are all $\delta$-discretely presented in $K$, then one has $d_I(M, \tilde{M}) \leq d_b(M, \tilde{M}) \leq \delta$.

Proof. Item (1) comes from Proposition 4.4. Item (2) is a direct consequence of Propositions 5.2 and 5.4. Item (3) comes from the construction of $\tilde{I}_i$. □


6. Exact matching 


The algorithms and theoretical results presented in the previous sections were all obtained using exact matchings (see Definition 3.5). In this section, we seek to understand conditions under which a given matching function is exact. We first present assumptions that allow for finding exact matching functions in Section 6.1. Then, we discuss these assumptions in Section 6.2, and we finally show that the vineyard matching [CSEM06] is exact in Section 6.3.


6.1. A naive approach to exact matching 


In order to understand which matching functions are exact, we first define a notion of compatibility between bars. 


Definition 6.1 (Compatible bars). Let $I$ be an interval module, and let $l_1, l_2 \subseteq \mathbb{R}^n$ be two $\delta$-consecutive diagonal lines. Assume $\mathrm{supp}(I) \cap l_1 \neq \emptyset$ and $\mathrm{supp}(I) \cap l_2 \neq \emptyset$, and let $[b^{l_1}, d^{l_1}]$ and $[b^{l_2}, d^{l_2}]$ be the corresponding bars. These bars are compatible if the rectangles $R_{b^{l_1}, b^{l_2}}$, $R_{b^{l_2}, b^{l_1}}$, $R_{d^{l_1}, d^{l_2}}$ and $R_{d^{l_2}, d^{l_1}}$ are flat. Moreover, we say that $[b^{l_1}, d^{l_1}]$ is compatible with the empty set in $l_2$ if

$$\|b^{l_1} - d^{l_1}\|_\infty \leq 2\delta.$$
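Definition 6.1 translates directly into a coordinate-wise test. The sketch below is only illustrative: flatness is defined earlier in the paper, and we assume here that a rectangle $R_{a,b}$ is flat unless $a < b$ strictly in every coordinate (i.e., unless it has nonempty interior). The function names are hypothetical.

```python
from typing import Sequence, Tuple

Point = Sequence[float]
Bar = Tuple[Point, Point]  # (birthpoint, deathpoint) of a bar on a diagonal line


def is_flat(a: Point, b: Point) -> bool:
    """Assumed convention: R_{a,b} is flat unless a < b in every coordinate."""
    return not all(x < y for x, y in zip(a, b))


def are_compatible(bar1: Bar, bar2: Bar) -> bool:
    """Four-rectangle test of Definition 6.1 for bars on consecutive lines."""
    (b1, d1), (b2, d2) = bar1, bar2
    return (is_flat(b1, b2) and is_flat(b2, b1)
            and is_flat(d1, d2) and is_flat(d2, d1))


def compatible_with_empty(bar: Bar, delta: float) -> bool:
    """A bar is compatible with the empty set if its length is at most 2*delta."""
    b, d = bar
    return max(abs(x - y) for x, y in zip(b, d)) <= 2 * delta


bar_l1 = ((0.0, 0.0), (3.0, 3.0))
bar_l2 = ((0.5, -0.5), (3.5, 2.5))        # endpoints incomparable with bar_l1
bar_l2_far = ((1.5, 0.5), (3.5, 2.5))     # birthpoint strictly above (0, 0)
```

Here `bar_l1` and `bar_l2` pass the test because none of their endpoints are strictly comparable, whereas the birthpoint of `bar_l2_far` is strictly larger than that of `bar_l1`, so the rectangle between them is not flat.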
Remark 6.2. It follows from Lemma 2.9 that bars along consecutive lines that correspond to the same indicator summand of a multipersistence module are always compatible.

Compatible bars enjoy some useful properties, which we state in the following lemma.

Lemma 6.3. Let $l_1$ and $l_2$ be two $\delta$-consecutive lines, and let $[b_1, d_1] := B(I|_{l_1})$ be the bar of an indicator module $I$ along $l_1$. Let $[b_2, d_2]$ be a bar along $l_2$ that is compatible with $[b_1, d_1]$. Then, $d_2$ (resp. $b_2$) is included in a segment of size $\delta$ in $l_2$ that is independent of $d_2$ (resp. $b_2$).

Proof. Applying Lemma 2.22, one has

$$d_2 \in C := [B_\delta(d_1) \cap l_2] \setminus [\{z \in \mathbb{R}^n : z > d_1\} \cup \{z \in \mathbb{R}^n : z < d_1\}].$$

Since $C$ is a nonempty, totally ordered set, we can define $y := \min C$. By construction, we know that there exists a dimension $i$ such that $y_i \geq (d_1)_i$, and thus $C$ must be included in the segment $[y, y + \delta \cdot \mathbf{1}]$ along $l_2$.
The proof applies straightforwardly to $b_2$ by symmetry. □
Since bars that are matched under an exact matching function are always compatible, one way to construct an exact matching between two barcodes is therefore to isolate, among all possible matching functions, the ones such that matched bars are compatible. If this family contains a single element, it must be the exact matching we are looking for. This typically happens for interval decomposable multipersistence modules whose summands are sufficiently separated, as we show in the proposition below.


Proposition 6.4. Let $M = \bigoplus_{i \in \mathcal{I}} I_i$ be an interval decomposable multipersistence module. Let $\delta > 0$, and $I$, $I'$ be two interval summands in the decomposition of $M$. Assume that the two following properties are satisfied:

1. Let $l \subseteq \mathbb{R}^n$ be a diagonal line such that $\mathrm{supp}(I) \cap l \neq \emptyset$ and $\mathrm{supp}(I') \cap l \neq \emptyset$. Then, one has either $\|b^l_I - b^l_{I'}\|_\infty > \delta$ or $\|d^l_I - d^l_{I'}\|_\infty > \delta$. In other words, the endpoints of the bar in $B(I|_l)$ and of the bar in $B(I'|_l)$ are at distance at least $\delta$.

2. The bars of length at most $2\delta$ in $I$ and $I'$ are at distance more than $\delta/2$, i.e., if we let $S^I := \{l : l \cap \mathrm{supp}(I) \neq \emptyset, \ \|b^l_I - d^l_I\|_\infty \leq 2\delta\}$ (and similarly for $I'$), one has

$$d_\infty(S^I, S^{I'}) > \delta/2.$$

In other words, a small bar in $I$ cannot be too close to a small bar in $I'$.

Let $K \subseteq \mathbb{R}^n$ be a compact set and $L$ be a $\delta$-grid of $K$. Then, the matching function $M_{\mathrm{comp}}$, induced by matching bars that are compatible together, is well-defined and exact.


See Figure 11 for an illustration of assumptions (1) and (2).


Figure 11: (Left) Example of module whose interval summands do not satisfy assumption (2). (Right) Example of 
module whose interval summands do satisfy assumptions (1) and (2). Bars corresponding to consecutive lines can only be 
matched if they are compatible, which, in this figure, means that they have the same color, i.e., that they are associated to 
the same interval summand. 


Proof. Let $I$ and $I'$ be two interval summands in the decomposition of $M$. Let $l_1$ and $l_2$ be two $\delta$-consecutive lines of $L$, and let $b := B(I|_{l_1})$ be the bar corresponding to $I$ along $l_1$. We will show that $M_{\mathrm{comp}}$ must match $b$ to either $b' := B(I|_{l_2})$ if $\mathrm{supp}(I) \cap l_2 \neq \emptyset$, or the empty set if $\mathrm{supp}(I) \cap l_2 = \emptyset$.

• If $\mathrm{supp}(I) \cap l_2 = \emptyset$, then by Lemma 2.22, the length of $b$ is at most $\delta$, i.e., $\|b^{l_1}_I - d^{l_1}_I\|_\infty \leq \delta$. It is thus compatible with the empty set. Now, since $d_\infty(l_1, l_2) = \delta/2$ and since $l_1 \in S^I$, assumption (2) ensures that the bar $b'' := B(I'|_{l_2})$ (if it exists) must be of length at least $2\delta$. In particular, it is not compatible with $b$, hence $M_{\mathrm{comp}}$ cannot match $b$ to $b''$, and must match $b$ to the empty set.

• If $\mathrm{supp}(I) \cap l_2 \neq \emptyset$, then the bar $b' = [b^{l_2}_I, d^{l_2}_I]$ in $B(I|_{l_2})$ is compatible with $b$, as per Remark 6.2. According to Lemma 6.3, it follows that the birthpoint and deathpoint of any bar along $l_2$ that is compatible with $b$ must belong to segments $s_b$, $s_d$ of length $\delta$ that contain $b^{l_2}_I$ and $d^{l_2}_I$ respectively. Let $b'' := [b^{l_2}_{I'}, d^{l_2}_{I'}]$ be the bar in $B(I'|_{l_2})$ (if it exists). According to assumption (1), we either have $\|b^{l_2}_I - b^{l_2}_{I'}\|_\infty > \delta$ or $\|d^{l_2}_I - d^{l_2}_{I'}\|_\infty > \delta$. In particular this means that either $b^{l_2}_{I'} \notin s_b$ or $d^{l_2}_{I'} \notin s_d$. Hence $b''$ is not compatible with $b$, and $M_{\mathrm{comp}}$ must match $b$ to $b'$.

In both cases, $M_{\mathrm{comp}}$ is well-defined and exact. □


Note that, since the number of lines of $L$, and thus their spacing $\delta$, is controlled by the user in Algorithm 1, one can ensure that the assumptions of Proposition 6.4 are satisfied by asking $L$ to have a sufficient number of lines in Algorithm 1 (which unfortunately increases the complexity as well). Note also that bars matched under the vineyard matching are always compatible (see Section 6.3), which ensures that the vineyard matching is exact when $L$ contains a sufficient number of lines.
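To make the role of uniqueness in Proposition 6.4 concrete, the following sketch matches the bars of two consecutive lines by compatibility alone, and fails loudly as soon as a bar has several compatible candidates. It is a simplified illustration rather than the matching used in Algorithm 1: the flatness convention (a rectangle is flat unless its corners are strictly ordered in every coordinate) is an assumption, all names are hypothetical, and injectivity of the resulting matching is not checked.

```python
def is_flat(a, b):
    """Assumed convention: R_{a,b} is flat unless a < b in every coordinate."""
    return not all(x < y for x, y in zip(a, b))


def are_compatible(bar1, bar2):
    (b1, d1), (b2, d2) = bar1, bar2
    pairs = ((b1, b2), (b2, b1), (d1, d2), (d2, d1))
    return all(is_flat(p, q) for p, q in pairs)


def match_by_compatibility(bars1, bars2, delta):
    """Map each bar index of bars1 to a bar index of bars2, or to None
    (the empty set). Raises ValueError when compatibility is ambiguous."""
    matching = {}
    for i, bar in enumerate(bars1):
        candidates = [j for j, other in enumerate(bars2)
                      if are_compatible(bar, other)]
        length = max(abs(x - y) for x, y in zip(*bar))
        if len(candidates) == 1:
            matching[i] = candidates[0]
        elif not candidates and length <= 2 * delta:
            matching[i] = None  # short bar, compatible with the empty set
        else:
            raise ValueError(f"bar {i} has no unique compatible match")
    return matching


bars_l1 = [((0.0, 0.0), (3.0, 3.0)), ((10.0, 10.0), (10.4, 10.4))]
bars_l2 = [((0.5, -0.5), (3.5, 2.5))]
result = match_by_compatibility(bars_l1, bars_l2, delta=0.5)
```

In this toy input the long bar has a single compatible partner and the short bar has none, so the latter is matched to the empty set. When assumptions (1) or (2) fail, several candidates can appear and the sketch raises an error instead of guessing.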
6.2. Limitations


One might wonder whether the usual distances between barcodes, such as the bottleneck or Wasserstein distances, could be used to define exact matching functions, instead of having to look for matching functions that only match bars that are compatible. Indeed, a major advantage of, e.g., Wasserstein distances, is that their associated matching function is usually unique. However, when the spacing $\delta$ between two lines is too large, Wasserstein distances can still fail to match bars exactly, if assumptions (1) or (2) are not satisfied, as shown in Figure 12.


Figure 12: Example interval decomposable multipersistence module with two interval summands (green and purple), and 
its barcodes along two lines (here the two couples of red-blue bars). Any matching function induced by, e.g., Wasserstein 
distances between the barcodes, will match the red bar with the red bar and the blue bar with the blue bar; however, this 
matching is not exact. 


Moreover, even when the spacing $\delta$ is small, it is easy to build examples where assumptions (1) and (2) are not satisfied. The toy example in Figure 13 shows that finding exact matching functions from bar compatibility alone can lead to poor results in general.


Figure 13: The interval decomposable multipersistence modules on the (Left) and on the (Right) both have a yellow and 
a brown summand, and have the same fibered barcodes. Since bars corresponding to lines in the middle have multiplicity 
2, matching functions identified using bar compatibility can match them arbitrarily. 


Furthermore, even a single mistake in the matching between consecutive barcodes can lead to arbitrarily different decompositions, as illustrated in Figure 14.


Figure 14: The modules on the (Left) and on the (Right) are both decomposable into two interval summands (yellow and 
brown). These modules, which are at a large bottleneck distance from each other, can be obtained from a single matching 
exchange in the middle of the small square. 


One way to handle these issues is to use the representative chains associated to the bars of the fibered barcode in order to find a matching. Indeed, given two bars in consecutive barcodes, one can compare representatives of their generator chains in order to check if they correspond to the same underlying interval summand, by assessing whether one chain can be obtained from the other through the addition of another positive chain that appeared earlier than the current chain (in its corresponding filtration). Note that computing barcodes and matching their bars through representative chains in two separate steps is not efficient: for simplicial complexes, the cost of computing the barcode on a given line is $O(N^3)$ (where $N$ is the number of simplices), and checking if two generator chains are associated to the same summand takes $O(N^2)$ operations (which is the cost of applying Gaussian elimination on a column of the boundary matrix). Fortunately, the so-called vineyard algorithm can perform both operations at the same time, i.e., it can reduce the boundary matrices of different lines, and retrieve matching functions between the barcodes of consecutive lines as a byproduct. We detail these statements in the next section.


6.3. Vineyard matching 


In this section, we prove that the matching induced by the vineyard algorithm [CSEM06] is an exact matching for multipersistence modules computed from simplicial complexes (although this seems to be common knowledge, we could not find a proof of this result in the literature). We first recall the basic notions of persistent homology from simplicial complexes in Section 6.3.1, and then provide an analysis of the vineyard algorithm in Section 6.3.2.


6.3.1. Persistent homology of simplicial complexes. We assume in the following that the reader is familiar with simplicial complexes, boundary operators and homology groups, and we refer the interested reader to [Mun84, Chapter 1] for a thorough treatment of these notions. The first important definition is that of filtered simplicial chain complexes.


Definition 6.5. Let $S$ be a simplicial complex, and $f : S \to \mathbb{R}$ be a filtration function, i.e., $f$ satisfies $f(\sigma) \leq f(\tau)$ when $\sigma \subseteq \tau$. Then, the filtered simplicial chain complex $(S, f)$ is defined as $(S, f) = ((C_t)_{t \in \mathbb{R}}, \iota)$, where

• $C_t = \langle \sigma_0, \dots, \sigma_i \rangle$ is the vector space over a field $k$ whose basis elements are the simplices that have filtration values at most $t$, i.e., $\{\sigma_0, \dots, \sigma_i\} = \{\sigma \in S : f(\sigma) \leq t\}$, and

• for any $s \leq t$, the map $\iota = \iota_s^t : C_s \to C_t$ is the canonical injection.

Note that $f$ can be used to define an order on the simplices of $S = \{\sigma_i\}_{i=0}^{N}$, by using the ordering induced by the filtration values. In other words, we assume in the following that $f(\sigma_0) \leq f(\sigma_1) \leq \dots \leq f(\sigma_N)$. We also slightly abuse notation and define $C_i := \langle \sigma_0, \dots, \sigma_i \rangle$ for any $i \in [\![0, N]\!]$ and

$$(S, f) = \big(C_0 \xrightarrow{\iota} C_1 \xrightarrow{\iota} \dots \xrightarrow{\iota} C_N = \langle S \rangle\big). \qquad (8)$$

Then, applying the homology functor $H_*$ on this filtered simplicial chain complex yields the following one-dimensional persistence module

$$H_*(S, f) = 0 \to H_*(C_0) \to H_*(C_1) \to \dots \to H_*(C_N).$$


An important theorem of (one-dimensional) persistent homology states that, up to a change of basis, it is possible to 
pair some chains together in order to define the so-called one-dimensional persistence barcode associated to the filtered 
simplicial chain complex. 


Theorem 6.6 (Persistence pairing, [dMV11, Theorem 2.6]). Given a filtered simplicial chain complex $(S, f) = C_1 \to C_2 \to \dots \to C_N$ and associated persistence module $H_*(S, f)$, there exists a partition $[\![1, N]\!] = E \sqcup B \sqcup D$, a bijective map $\mathrm{Low} : D \to B$, and a new basis $\hat{\sigma}_1, \dots, \hat{\sigma}_N$ of $C_N$, called reduced basis, such that:

• $C_i = \langle \hat{\sigma}_1, \dots, \hat{\sigma}_i \rangle$,
• $\partial \hat{\sigma}_e = 0$ for any $e \in E$,
• for any $d \in D$, one has $\partial \hat{\sigma}_{\mathrm{Low}(d)} = 0$, and $\partial \hat{\sigma}_d$ is equal to $\hat{\sigma}_{\mathrm{Low}(d)}$ up to simplification, i.e., there exists a set of indices $\mathrm{bd}(d)$ such that (i) $j < \mathrm{Low}(d) < d$ for any $j \in \mathrm{bd}(d)$, and (ii) $\partial \hat{\sigma}_d = \hat{\sigma}_{\mathrm{Low}(d)} + \sum_{j \in \mathrm{bd}(d)} \hat{\sigma}_j$.

In particular, the chains $\{\hat{\sigma}_j : j \in E \cap [\![1, i]\!]\} \cup \{\hat{\sigma}_j : j \in B \cap [\![1, i]\!] \text{ and } \exists d > i \text{ s.t. } \mathrm{Low}(d) = j\}$ form a basis of the simplicial homology groups $H_*(C_i)$. Moreover, the chains $\{\hat{\sigma}_j : j \in B \cup E\}$ are called generator chains while the chains $\{\hat{\sigma}_j : j \in D\}$ are called relation chains.

The multiset of bars $\mathcal{B} := \{[f(\sigma_b), f(\sigma_d)] : b = \mathrm{Low}(d)\} \cup \{[f(\sigma_e), +\infty) : e \in E\}$ is called the persistence barcode of the filtered simplicial chain complex $(S, f)$ and of the single-parameter persistence module $H_*(S, f)$.


Note that while the reduced basis $\{\hat{\sigma}_1, \dots, \hat{\sigma}_N\}$ does not need to be unique, the pairing map $\mathrm{Low}$ is actually independent of that reduced basis (see VII.1, Pairing Lemma).
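For readers who prefer code, the pairing of Theorem 6.6 can be obtained with the classical left-to-right column reduction of the boundary matrix. The sketch below is a minimal version under simplifying assumptions that are ours and not the theorem's: coefficients are taken in $\mathbb{Z}/2$, simplices are indexed from 0 in filtration order, and each column is stored as the set of row indices of its nonzero entries. The function name is hypothetical.

```python
def reduce_boundary_matrix(columns):
    """Standard persistence reduction over Z/2 (assumed coefficient field).

    columns[j] is the set of indices of the facets of simplex j, with
    simplices indexed in filtration order. Returns (low, essential), where
    low maps each index d in D to Low(d) in B, and essential lists E.
    """
    reduced = [set(col) for col in columns]
    pivot_owner = {}  # lowest nonzero row -> column that owns this pivot
    low = {}
    for j, col in enumerate(reduced):
        # Add earlier reduced columns until the pivot of column j is new.
        while col and max(col) in pivot_owner:
            col ^= reduced[pivot_owner[max(col)]]  # in-place sum over Z/2
        if col:
            pivot_owner[max(col)] = j
            low[j] = max(col)
    paired = set(low) | set(low.values())
    essential = [j for j in range(len(columns)) if j not in paired]
    return low, essential


# Filled triangle: vertices 0, 1, 2, edges 3 = {0,1}, 4 = {1,2}, 5 = {0,2},
# and the 2-simplex 6 = {3,4,5}.
triangle = [set(), set(), set(), {0, 1}, {1, 2}, {0, 2}, {3, 4, 5}]
low, essential = reduce_boundary_matrix(triangle)
```

On this example the edges 3 and 4 kill the components born at vertices 1 and 2, the edge 5 creates a cycle that is killed by the triangle 6, and vertex 0 is the only essential class.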


6.3.2. Vineyard algorithm and matching. The vineyard algorithm is a method for finding reduced chain bases for filtered simplicial complexes whose simplex orderings only differ by a single transposition of consecutive simplices, that we denote by $(i \ \ i+1)$.


Proposition 6.7 ([CSEM06]). Let $S = \{\sigma_1, \dots, \sigma_N\}$ be a (filtered) simplicial complex (with filtration function $f$), and let $B = \{\hat{\sigma}_1, \dots, \hat{\sigma}_N\}$ be a corresponding reduced chain basis. Let $\tilde{f} : S \to \mathbb{R}$ be a filtration function that swaps the simplices at positions $i$ and $i+1$, i.e., that induces a new filtered simplicial complex $\tilde{S} = \{\sigma_1, \dots, \sigma_{i+1}, \sigma_i, \dots, \sigma_N\}$. Note that the swapped basis $\mathrm{sw}_i(B) = \{\hat{\sigma}_1, \dots, \hat{\sigma}_{i+1}, \hat{\sigma}_i, \dots, \hat{\sigma}_N\}$ might not be a reduced basis for $(\tilde{S}, \tilde{f})$, since the pairing map $\mathrm{Low}$ might not be well-defined anymore. Fortunately, there exists a change of basis, called vineyard update, that turns $\mathrm{sw}_i(B)$ into a reduced basis $\tilde{B}$ of $(\tilde{S}, \tilde{f})$ in $O(N)$ time, and that comes with a bijective map $\mathrm{vine} : B \to \tilde{B}$, called vineyard matching.
Note that the vineyard matching can be straightforwardly extended to filtration functions whose induced ordering differs by a whole permutation from the one of the initial filtration function, by simply decomposing the permutation into elementary transpositions (using, e.g., Coxeter decompositions). However, one might wonder whether the resulting vineyard matching depends on the decomposition of the permutation or not. While the vineyard matching seems to be independent of the decomposition that we used in our experiments, we leave this question as a conjecture for future work.
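As an illustration of the decomposition step, one possible (and by no means canonical) way of writing a permutation as a product of transpositions of consecutive positions is to record the swaps performed by bubble sort and replay them backwards. The sketch below uses 0-based positions and hypothetical function names; it only produces the sequence of swaps, the vineyard update itself being out of its scope.

```python
def adjacent_transpositions(perm):
    """Return positions i_1, ..., i_k such that successively swapping the
    entries at positions (i_j, i_j + 1) of the identity ordering yields perm.

    Bubble sort brings perm back to the identity; since each adjacent swap is
    an involution, replaying the recorded swaps in reverse order does the
    converse.
    """
    work = list(perm)
    swaps = []
    for end in range(len(work) - 1, 0, -1):
        for i in range(end):
            if work[i] > work[i + 1]:
                work[i], work[i + 1] = work[i + 1], work[i]
                swaps.append(i)
    return swaps[::-1]


def apply_transpositions(n, swaps):
    """Apply the adjacent swaps, in order, to the identity ordering of size n."""
    order = list(range(n))
    for i in swaps:
        order[i], order[i + 1] = order[i + 1], order[i]
    return order


target = [2, 0, 1]
sequence = adjacent_transpositions(target)
```

The number of swaps produced this way equals the number of inversions of the permutation, which is the minimum possible; other decompositions of the same permutation exist, and whether they all induce the same vineyard matching is precisely the question raised above.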


Conjecture 6.8. Let $\tau \in \mathfrak{S}_N$ be a simplex permutation of a (filtered) simplicial complex $S = \{\sigma_1, \dots, \sigma_N\}$. Then, the vineyard matching between the reduced bases of $S$ and $\tilde{S} = \{\sigma_{\tau(1)}, \dots, \sigma_{\tau(N)}\}$ does not depend on the choice of the sequence $J = \{i_1, \dots, i_k\}$ such that $\tau = \prod_{j=1}^{k} (i_j \ \ i_j + 1)$.


6.3.3. Application to multipersistence. When the simplices of $S$ are (partially) ordered from a function $f : S \to \mathbb{R}^n$,
i.e., one has $f(\tau) \le f(\sigma)$ for any $\tau \subseteq \sigma$ (where $\le$ is the partial order of $\mathbb{R}^n$), then the function $f$ is called a multi-parameter
filtration function of $S$. In that case, applying the homology functor also leads to simplicial homology groups $\{H_*(C_i)\}_{i \in I}$,
where $I \subseteq [\![1, N]\!]^n$ is the (partial) ordering associated to $f$, and these groups are connected by morphisms as long as
their indices are comparable in $\mathbb{R}^n$. These groups and maps are called the multi-parameter persistent homology associated
to the multi-filtered simplicial chain complex $(S, f)$. Similarly to the single-parameter case, when $k$ is a field, one can
define the multi-parameter persistence module associated to $(S, f)$ as the family of vector spaces indexed over $\mathbb{R}^n$ defined
with the identifications $M_s := H_*(C_s)$, where $C_s = \mathrm{Vect}\{\sigma \in S : f(\sigma) \le s\}$ for $s \in \mathbb{R}^n$.
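
To make the monotonicity condition on a multi-parameter filtration function concrete, here is a minimal sketch. It assumes that simplices are encoded as sorted tuples of vertex indices and that the complex is closed under taking faces. Both helper names are hypothetical; the lower-star construction used to build an example takes the coordinate-wise maximum of the vertex values.

```python
from itertools import combinations


def lower_star(simplices, vertex_values):
    """Lower-star multi-filtration: coordinate-wise max over each simplex's vertices."""
    n = len(next(iter(vertex_values.values())))  # number of parameters
    return {
        s: tuple(max(vertex_values[v][j] for v in s) for j in range(n))
        for s in simplices
    }


def is_multifiltration(f):
    """Check f(tau) <= f(sigma) coordinate-wise for every facet tau of sigma.

    Checking codimension-1 faces is enough when the complex is closed
    under faces, since the partial order on R^n is transitive.
    """
    for sigma, value in f.items():
        for tau in combinations(sigma, len(sigma) - 1):
            if not tau:          # vertices have no non-empty faces
                continue
            if tau not in f:     # not a simplicial complex
                return False
            if any(a > b for a, b in zip(f[tau], value)):
                return False
    return True


simplices = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]
vertex_values = {0: (0.1, 0.9), 1: (0.5, 0.2), 2: (0.3, 0.4)}
f = lower_star(simplices, vertex_values)
```

In this toy example the edge (0, 1) receives the value (0.5, 0.9), which dominates the values of both of its vertices, as required.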

We will now show that, assuming Conjecture 6.8, the vineyard algorithm yields an exact matching in
the sense defined earlier.


Proposition 6.9. Let $M$ be an interval decomposable multipersistence module computed from a finite multi-filtered simplicial
chain complex $(S, f)$ over $\mathbb{R}^n$, with support included in a compact set $K$ of $\mathbb{R}^n$. Let $\delta > 0$ and $L$ be a $\delta$-grid of K*°. Assume
that, for any two lines $l, l' \in L$, there exists a sequence $l = l_1, \dots, l_k = l'$ such that the simplex orderings induced by $l_i$ and $l_{i+1}$
differ by at most one transposition of two consecutive simplices, for any $i \in [\![1, k-1]\!]$. Then, the vineyard matching is exact.


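Proof. Let $l, l'$ be two diagonal lines in $\mathbb{R}^n$, and let $F := f|_l : S \to \mathbb{R}$ and $F' := f|_{l'} : S \to \mathbb{R}$. Up to a reordering of the
simplices of $S = \{\sigma_1, \dots, \sigma_N\}$, we assume without loss of generality that $F(\sigma_1) \le \dots \le F(\sigma_N)$. Let $\mathfrak{S}_N^f \subseteq \mathfrak{S}_N$ be the
subset of permutations that satisfies

\[
\tau \in \mathfrak{S}_N^f \iff (C_k^\tau)_{k \in [\![1,N]\!]} := \left(\{\sigma_{\tau(1)}, \dots, \sigma_{\tau(k)}\}\right)_{k \in [\![1,N]\!]} \text{ is a filtration of } S.
\]

Finally, let $N = (N_{\tau,k})_{(\tau,k) \in \mathfrak{S}_N^f \times [\![1,N]\!]}$, where $N_{\tau,k} = H_*(C_k^\tau)$, be the multipersistence module indexed over
$\mathfrak{S}_N^f \times [\![1,N]\!]$, with arrows induced by inclusion. Note that the one-dimensional persistence module
$N_{\mathrm{id}} := (N_{\mathrm{id},k})_{k \in [\![1,N]\!]}$ is isomorphic to $M|_l$.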

Now, let $\tau$ be the simplex permutation that matches the simplex ordering induced by $F$ to the one induced by $F'$. By
definition, one has $\tau \in \mathfrak{S}_N^f$. Let $i$ be the index of the first transposition in the Coxeter decomposition of $\tau$. Then, the
matching vine associated to the transposition $(i\ i+1)$ induces a matching between the bars of $B(M|_l)$ and $B(M|_{l'})$, or
equivalently, between the bars of $B(N_{\mathrm{id}})$ and $B(N_{(i\ i+1)})$, as well as a morphism between $N_{\mathrm{id}}$ and $N_{(i\ i+1)}$. Let us now
consider the diamond:


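\[
\cdots \longrightarrow N_{\mathrm{id},i-1} = N_{(i\ i+1),i-1}
\begin{array}{c}
\nearrow \; N_{(i\ i+1),i} \; \searrow \\
\searrow \; N_{\mathrm{id},i} \; \nearrow
\end{array}
N_{\mathrm{id},i+1} = N_{(i\ i+1),i+1} \longrightarrow \cdots \tag{9}
\]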


First, note that all maps in that diamond are either injective of corank 1 or surjective of nullity 1 (since they are all 
induced by adding a positive or negative chain of the corresponding reduced bases). Moreover, by the Mayer-Vietoris 
theorem, the following sequence is exact: 


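\[
N_{\mathrm{id},i-1} = N_{(i\ i+1),i-1} = H_*(C^{\mathrm{id}}_{i-1}) \longrightarrow N_{(i\ i+1),i} \oplus N_{\mathrm{id},i} = H_*(C^{(i\ i+1)}_{i}) \oplus H_*(C^{\mathrm{id}}_{i}) \xrightarrow{\;y - x\;} H_*(C^{\mathrm{id}}_{i+1}) = N_{\mathrm{id},i+1} = N_{(i\ i+1),i+1}.
\]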


Such diamonds are called transposition diamonds, and it has been shown in Theorem 2.4] that the morphism
induced by vine between the lower and upper parts of the diamond matches bars with the same representative positive
chains together. Hence, bars in $B(N_{\mathrm{id}})$ and $B(N_{(i\ i+1)})$ that are matched under vine are associated to the same summand
of $N$. Moreover, by repeating this argument with the other transpositions in the decomposition of $\tau$, one has that
the same is true for bars in $B(M|_l)$ and $B(M|_{l'})$.

Now, if $N$ is interval decomposable, then, since the transition maps of $M$ can be seen as transition maps of $N$, it follows
that interval summands of $M$ correspond to interval summands of $N$. In that case, using a dimensionality argument
with Theorem one can show that two bars of $B(M|_l)$ and $B(M|_{l'})$ that are matched under vine through arrows of $N$
also belong to the same interval summand of $M$. Unfortunately, even though $N$ and $M$ are constructed from the same
chain complex, $N$ contains many more arrows than $M$, and we cannot guarantee in the general case that $N$ is interval
decomposable.

We will thus conclude with Conjecture 6.8. By assumption, there exists a sequence of diagonal lines between $l$ and $l'$
such that the simplex orderings of two consecutive lines differ by at most a single transposition. The vineyard matchings
vine associated to these transpositions induce morphisms of $M$ that can be seen as morphisms of $N$, thus ensuring that the
matching is exact on both $M$ and $N$ between $l$ and $l'$. Hence, by Conjecture 6.8, the vineyard matching is unique and exact.

□


Remark 6.10. Note that the assumption of Proposition 6.9 is always satisfied when $\delta$ becomes smaller than the smallest
distance (in filtration function values) between critical points of the module.
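
The assumption of Proposition 6.9 can be checked numerically on a toy example. The sketch below assumes that a diagonal line is given by a basepoint x and the direction (1, ..., 1), and that a simplex with multi-filtration value v enters the restricted filtration at the parameter max_j (v_j - x_j); this parameterization is an assumption of the sketch. All names are hypothetical, and ties are broken by dimension so that faces come first.

```python
def entry_time(value, basepoint):
    """Parameter at which a simplex appears along the diagonal line through basepoint."""
    return max(v - b for v, b in zip(value, basepoint))


def line_ordering(f, basepoint):
    """Simplex ordering induced by the diagonal line through `basepoint`."""
    return sorted(f, key=lambda s: (entry_time(f[s], basepoint), len(s), s))


def at_most_one_adjacent_swap(order_a, order_b):
    """True if the two orderings are equal or differ by one swap of neighbours."""
    diff = [k for k, (a, b) in enumerate(zip(order_a, order_b)) if a != b]
    if not diff:
        return True
    return (
        len(diff) == 2
        and diff[1] == diff[0] + 1
        and order_a[diff[0]] == order_b[diff[1]]
        and order_a[diff[1]] == order_b[diff[0]]
    )


# Three vertices with values in R^2 (toy data, chosen for illustration only).
f = {(0,): (0.0, 0.0), (1,): (0.2, 0.6), (2,): (0.5, 0.1)}
order_a = line_ordering(f, (0.0, 0.0))
order_b = line_ordering(f, (0.0, 0.3))
```

Here the two lines induce the orderings (0), (2), (1) and (0), (1), (2), which differ by a single transposition of consecutive simplices, so a single vineyard update would relate their reduced bases.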


7. Experiments 


In this section, we showcase the performance of Algorithm 1 on various data sets. More precisely, we evaluate the
running times and approximation errors of our approximation scheme on both synthetic and real data sets, and we
measure the empirical dependencies on the number of simplices, on the number of lines that are used, and on the
dimension n. We also compare our approach to Rivet when n = 2. All experiments were done on a laptop with an AMD
Ryzen 4800 CPU. Our code is publicly available at https://gitlab.inria.fr/dloiseau/multipers and is implemented in
C++, with a Python interface.


7.1. Simple examples 


In this section, we first provide examples of multipersistence module approximation when the underlying multipersistence
module is manually crafted and known. In Figures 15 and 16, we provide two examples of pairs of distinct interval
decomposable multipersistence modules that have the same pointwise Betti numbers and rank invariants. In both
examples, our approximation scheme manages to recover the correct decompositions. In Figure 17, we provide a
multipersistence module that is not interval decomposable, and our (fake) candidate decomposition.

Figure 15: (Top) Two distinct interval decomposable modules having the same rank invariant. (Bottom) Output of
Algorithm 1; each color corresponds to a different summand.

Figure 16: (Top) Two distinct interval decomposable modules having the same rank invariant. (Bottom) Output of
Algorithm 1; each color corresponds to a different summand.


Figure 17: (Top left) Filtered simplicial chain complex that leads to the (Top right) indecomposable multipersistence
module. (Bottom) Output of Algorithm 1; each color corresponds to a different summand. The output is not a true
decomposition of the module, although it still preserves the rank invariant.


7.2. Convergence 


In this section, we check the empirical convergence of Algorithm 1 on real data sets. We look at a noisy circle with 1,000
points (60% of those are on the annulus, and the remaining 40% are outliers in the square) in Figure 18, as well as three
time series from the UCR archive that were embedded in $\mathbb{R}^3$ using time delay embedding in Figures 19, 20 and 21.
Our approximation was computed with n = 2 filtrations, the Alpha complex filtration and a log-density estimation. Since
we do not know the underlying multipersistence module, we use the Euclidean norm between (a) the multipersistence
image of our approximation and (b) a limit multipersistence image computed on our approximation with a
limit precision (i.e., distance between consecutive lines) of $\delta$ = 10~* as a proxy to measure the distance between our
approximation and the true underlying module. In all cases, one can see that the error curve decreases fast and smoothly,
as predicted by our approximation result (Proposition 5.5).

Figure 18: (Left) Noisy annulus data set colored by log density. (Middle) Multipersistence image in dimensions 0 and 1.
(Right) Error plot showing convergence.

Figure 19: (Left) Multipersistence image in dimensions 0 (left), 1 (top right) and 2 (bottom right). (Right) Error plot
showing convergence for the first time series of the Coffee data set.

Figure 20: (Left) Multipersistence image in dimensions 0 (left), 1 (top right) and 2 (bottom right). (Right) Error plot
showing convergence for the first time series of the Worms data set.

Figure 21: (Left) Multipersistence image in dimensions 0 (left), 1 (top right) and 2 (bottom right). (Right) Error plot
showing convergence for the first time series of the Wine data set.

Figure 22: (Left) Multipersistence image in dimensions 0 (left), 1 (top right) and 2 (bottom right). (Right) Error plot
showing convergence for the first time series of the Ham data set.


7.3. Performance 


In this section, we empirically check the dependencies between running time and numbers of lines, simplices and 
dimensions. 


7.3.1. Synthetic data with n = 2. We first focus on two synthetic data sets: (a) the noisy annulus of Section 7.2, and
(b) a random point cloud in the unit square $[0, 1]^2$ with one filtration being the usual Alpha complex filtration, and the
other being the lower star filtration of a random function on the points. We show the influence of the number of lines
and simplices on the running times for both data sets in Figure 23. As expected, the running time is linear w.r.t. the
number of lines. Furthermore, the linear coefficient depends on the complexity of the data set: the random filtration
on the points of the square yields longer running times than the noisy annulus. As for the number of simplices, we
empirically observed a quadratic dependency of the running time.

Figure 23: Running times for the noisy annulus (red) and the random point cloud in the unit square (blue). (Top) Running
times with respect to the number of lines of the two datasets; the number of points is here fixed at 1,000 points. (Bottom
left) Running times with respect to the number of simplices; the number of lines is here fixed at 1,000 lines. We also
show the curve in log-log scale in (Bottom right).


7.3.2. Higher dimension. To illustrate the fact that Algorithm 1 can run with more than two filtrations, we now focus
on a synthetic data set. We uniformly sample 300 points in the unit square $[0,1]^2$, and then compute their Alpha complex.
Finally, we assign to each vertex a random function value in $[0,1]^n$, and compute the approximate multipersistence
module (with $\delta = 0.7$) induced by the lower-star filtration of this random function. We repeated this experiment for
several numbers of dimensions n, and show the result in Figure 24. As expected, there is an exponential scaling with
respect to the dimension when $\delta$ is fixed.

Figure 24: Running time w.r.t. the number of dimensions. Note that since the precision $\delta$ is fixed, the number of lines
grows exponentially with the number of dimensions.
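
The exponential growth of the number of lines can be illustrated with a back-of-the-envelope count. The sketch below assumes, as a simplification, one diagonal line per node of a regular grid of step delta on an (n-1)-dimensional face of a box of side length `side`; the exact grid used by Algorithm 1 may differ, and the function name is hypothetical.

```python
import math


def approx_num_lines(side, delta, n):
    """Rough number of diagonal lines of a delta-grid over a box in R^n.

    Assumption of this sketch: basepoints lie on a regular grid of step
    `delta` on an (n-1)-dimensional face of a box with side length `side`.
    """
    per_axis = math.ceil(side / delta)
    return per_axis ** (n - 1)


# Number of lines for a fixed precision and an increasing number of filtrations.
counts = {n: approx_num_lines(1.0, 0.25, n) for n in range(2, 6)}
```

With a fixed precision, each additional filtration multiplies the count by the same factor, which is consistent with the exponential scaling of the running time reported above.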


7.3.3. Comparison with Rivet when n = 2. As mentioned in the previous sections, Rivet [LW15] is a tool for
computing minimal presentations of 2-multipersistence modules. We provide performance comparisons between Rivet
and Algorithm 1 in the tables below.


In this first table, we focus on the noisy annulus of Section 7.2, with 80% of the points uniformly sampled on the
annulus and 20% outliers in the square, and the same filtrations as in Section 7.2.

#points   #simplices | Rivet    Rivet, peak RAM   Alg. 1, δ = 0.01   Alg. 1, δ = 0.001   Alg. 1, peak RAM
100       563        | 0.02s    9MB               0.004s             0.03s               140MB
1 000     5 943      | 0.49s    220MB             0.18s              0.69s               150MB
5 000     29 907     | 22.13s   5.41GB            4.15s              8.65s               180MB
7 000     41 879     | 59s      10.29GB           8.00s              14.39s              187MB
10 000    59 887     | OOM      12.8GB            16s                26s                 204MB
20 000    119 831    | -        -                 71s                85s                 237MB


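One can see that, on the larger data sets (5 000 points and more), Algorithm 1 significantly outperforms Rivet in both RAM usage and running time: with 7 000 points, Rivet takes 59s and 10.29GB, whereas Algorithm 1 takes 8.00s ($\delta = 0.01$) or 14.39s ($\delta = 0.001$) and 187MB.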
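
As a rough consistency check of the quadratic dependency on the number of simplices noted in Section 7.3.1, one can fit a slope in log-log scale to the running times of Algorithm 1 reported in the first table (column with precision 0.01). This is only a sketch: these timings were measured at fixed precision rather than at a fixed number of lines, the smallest instances are likely dominated by overhead, and the helper name is hypothetical.

```python
import math


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den


# Values copied from the first table (Algorithm 1, precision 0.01).
num_simplices = [563, 5943, 29907, 41879, 59887, 119831]
running_times_s = [0.004, 0.18, 4.15, 8.00, 16.0, 71.0]

slope = loglog_slope(num_simplices, running_times_s)
```

A slope in the vicinity of 2 is what a quadratic dependency would predict; a fit restricted to the largest instances would be less sensitive to constant overheads.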


In our second table, we focus on the first time series from the Coffee data set, processed as in Section 7.2. We then
use the Vietoris-Rips filtration as the first filtration, and density estimation as the second one.


threshold   #simplices | Rivet   Rivet, peak RAM   Alg. 1, δ = 0.01   Alg. 1, peak RAM
0.1         9 961      | 0.28s   38MB              0.96s              170MB
0.2         35 620     | 0.79s   80MB              12s                201MB
0.3         71 230     | 1.45s   122MB             48s                281MB
0.4         114 144    | 2.68s   166MB             124s               396MB
0.5         168 513    | 5.1s    219MB             263s               576MB


One can see that while Rivet works remarkably well on flag complexes such as Vietoris-Rips, Algorithm 1 is still able to
run in a reasonable amount of time (263s for 168 513 simplices).


8. Conclusion 


In this article, we presented an algorithm for approximating any n-multipersistence module, whose complexity, running
time, and approximation error can be controlled by user-defined parameters. We then showcased the performance
of our method on synthetic and real data sets, and provided our code in an open-source package available at
https://gitlab.inria.fr/dloiseau/multipers.

Several questions remain open for future work. While we proved that our candidate has bounded approximation 
error when approximating interval decomposable modules, can we prove that it is optimal (in some way) among the 
family of interval decomposable modules when the input is not interval decomposable? What are the properties of this 
output in the general case, and can it be used in practice instead of the original (non interval decomposable) module? 
Finally, can we use the stability results that are now becoming available for specific multipersistence modules to 
infer confidence regions and convergence rates for our candidate decompositions? 